# [EM] A reduction for a generalization of median ratings/MJ

Kristofer Munsterhjelm km_elmet at t-online.de
Tue Dec 2 04:03:17 PST 2014

```Consider an election with rated ballots, v voters, and c candidates, and
let r_{a, b} be the ath voter's rating of the bth candidate.
Furthermore, let p_1...p_n be an enumeration of the candidates and
voters, and say p_a "is a candidate" if it corresponds to a candidate,
and correspondingly for "is a voter".

Define a "distance" d_xy (from x to y) as
infinite if p_x is a candidate or both of p_x and p_y are voters,
-r_{a, b} otherwise (if p_x is the ath voter and p_y is the bth candidate)

Now, the following generalization of MJ springs to mind:

Given some integer k >= 1, pick a set S of k points so that the sum of
distances from all points not in S to their closest point in S is minimized.

(If k = 1, S consists of the point corresponding to the candidate with
greatest median rating.)

Then elect the candidates corresponding to the k points picked, and give
each of them weight equal to the number of voters who are closer to him
than to any other. (Tiebreaks are left as a detail.)

But the problem above is just the k-median problem! (

It's NP-hard, at least for continuous distances. Exact solvers don't
seem to blow up that horribly, though. There's an exact solver with
complexity O(n^sqrt(k)) for a minmax variant of the problem (
https://en.wikipedia.org/wiki/Facility_location_problem#Minimax_facility_location
). Perhaps one could make a similar complexity solver for the minsum
variant (i.e. the one above). If so, it would be manageable for
realistic numbers of candidates or parties.

I also imagine one could use greedy approximations to the problem to
solve it in polytime. The k-median problem is solved much more easily if
d gives a metric, but it should be possible to reduce to set cover
even if not and use the greedy (simple) algorithm for set cover there.
These approaches would, like Tideman is to Kemeny, be methods of their
own, and would have to be analyzed thus. Finally, the minmax variant
also reduces to MJ (absent tiebreaks) in the 1-seat case, and so could
possibly be used.

---

What would that give us? Applying it directly would give a weighted
candidate method, and one that doesn't seem to have the first preference
bias of methods based on IRV. The approximation methods might have
biases of their own, of course.

Applying it to party list also seems pretty obvious at first. Say the
voters rate/grade parties the way one would candidates in MJ. Then, to
solve, just set k very large (or solve the uncapacitated facility
location problem, which has no such limit k) and get the party weights.
Then feed those to Webster's algorithm to get seat counts for each party.

However, that could leave some parties with zero seats. One way of
handling this would be by elimination of those parties and running the
method again. That's the IRV approach, and probably would lead to or
increase nonmonotonicity. Another approach is more Bucklinesque. Start
with k=1 and increase until one of the parties get zero seats. Pick the
greatest k that gives every party at least one seat, and go with that.
It's not perfect, but probably the best one can do short of an integer
program (which would be rather complex). Furthermore, one would have
to prove that "early factionalist lock-in" wouldn't happen. Say that
there are a few voters that give their party max and every other party
min. Then perhaps these would, due to their uncompromising nature, would
get the fringe party a nonzero weight at low k. In that case, the party
would get zero seats, but in doing so, would bar other minor parties
that could appear later on. But perhaps this can't happen because the
fringe party is always given support after less fringe parties.

In any case, the Bucklin approach would appear to handle an LCR
equivalent the way we'd expect it to, e.g.:

46: L(Excellent, 3) C(Good, 2) R(Poor, 1)
43: R(Excellent, 3) C(Good, 2) L(Poor, 1)
11: C(Excellent, 3) R(Good, 2) L(Poor, 1)

If there's only one seat, then k=1 will give C one seat, but k=2 will
give R and L less than a seat each, so k=1 is chosen and C wins. But
with two seats, k=2 gives R and L one seat each, as desired.

For an ordinary (unweighted) candidate multiwinner method, one would
need a capacitated version of the k-median problem. Set k = number of
seats and the max capacity to ceil((num voters)/(num seats)). Then, by
pigeonhole, k candidates will be closer to one quota's worth of weight
than to any other number of quotas' worth, and every other candidate
will have weight zero.

Capacitated k-median could also be used to set an upper limit to the
weight of any candidate in a weighted method, e.g. a 10% limit mentioned
elsewhere.

---

To conclude, though, there are many approximation algorithms for the
k-median problem. Some of these might be simple enough, fast enough, and
close enough to the exact result that they could be used. They could
then be used for weighted candidate methods, or might be combined with
the Bucklin approach to create a party list method that, although not
perfect, would be considerably better than plurality-based party list
PR. (Whether it is possible depends on whether the factionalist lock-in
scenario is a real problem.)

---

 I'd imagine it to be NP-hard for discrete ones, too, by reduction
from set cover.

 At first glance, the distance seems to obey the triangle inequality,
simply because of the infinities. To make d fully metric, we would need
to relax it to the following:
d_xy is
infinite if both of p_x and p_y are voters or candidates,
-r{a, b} otherwise (if p_x is the ath voter and p_y the bth candidate
or vice versa)
But that raises the possibility that the method would pick from the
voters rather than the candidates. This could, I think, be avoided by
adding a large number of "phantom voters" that vote every candidate
equally so that v >> c.

 The problem is that ineffective votes can happen here as well, due
to how the quantization is not connected to the rest of the method. Say
that a bunch of left parties get slightly short of two seats each, and a
major right-wing party takes the remaining seats. Had the left voters
known this, they could have moved the excess from each party into some
other left-wing party and taken at least one seat from the right-wing;
and so, we'd like the method to do that automatically. But the above
won't do so because the weighted vote stage has no idea of how the
quantization by rounding works. (Statistical Condorcet will do the right
thing, though in a ranked setting; and until I've optimized it, it's
intractable for large party list elections.)
```