[EM] Proportional representation methods with very many candidates
Kristofer Munsterhjelm
km_elmet at t-online.de
Tue Dec 3 04:49:00 PST 2013
Recently, I've been fiddling a bit with recommender systems, and I have
noticed what is often a problem with these systems: they recommend very
many things of the same sort.
That is, if it detects that a person likes action movies, it may
recommend a whole slate of them. This may even be correct in the respect
that the user would like any randomly picked action movie ranked highly
by the system more than the non-action movies that come later. But if he
were to watch all of the movies from most recommended to least, he would
grow tired very quickly.
I noted that in political terms, this is similar to the cloning problem
you get when using a good majoritarian method (say, Condorcet) to fill a
council. If the majority-preferred candidate has many clones, then all
of that candidate's clones will fill the council, because the method
(correctly) judges that the majority prefers each clone to the other
candidates.
How can we deal with that in a political context? We can do so by using
PR. And this suggests that one could also do something similar in a
recommender system context.
That is, consider a cosine distance system: "you're likely to like what
others that liked what you did also like". Then each user has a weighted
vote corresponding to the similarity of sets (how much he likes what you
likes), and the recommender system works by running a weighted Approval
or Range election.
Which brings me to the point of my post (I tend to write long
introductions!). In such systems you can easily have thousands or
millions of "candidates". Which method should we choose if we want a PR
election in order to provide diversity? The sheer number poses problems
for many of the known systems.
For one, because there are so many "candidates", most of them will be
tied in a Plurality elimination stage like the one in STV. So
elimination will be more or less random at first, until one gets down to
sufficiently few candidates that the votes can accumulate. So ordinary
STV doesn't seem to be all that good.
So how about the virtual contest or combinatorial methods that avoid
this problem? Both virtual contest methods (like Schulze STV or CPO-STV)
and combinatorial methods (like PAV) don't scale well enough. And
sequential ones (PRV, sequential PAV, etc) fail setwise election criteria.
Thus my question: Can you think of any method that would work well in
this setting?
The only ones I can think of at the moment are STV-ME and a quadratic
STV method I don't remember the details of. The former, by its
"elimination election" by Condorcet among the losers, should have a
greater chance of telling different zero-Plurality-votes candidates
apart. And the quadratic STV method was a patch to STV that would let
one use any positional method possible while still retaining the quota
guarantees of STV. Again, that would make it easier to tell losers apart.
There is of course the compromise solution of just using sequential PAV
or proportional Range voting. In "worse-is-better" style, it wouldn't
care very much about theoretical guarantees as long as the method is
good enough. These sequential methods could also be brought closer to
the combinatorial ones by considering chunks of k candidates at a time,
with the greatest k that is still practical for the available processing
power.
More information about the Election-Methods
mailing list