[EM] Another proportionality metric for multiwinner elections, and its optimal Yee diagram

Thu Sep 12 14:04:07 PDT 2013

After writing the post on improving the Sainte-Laguë index, I started 
wondering about what the PR problem would look like, phrased 
geometrically. And I think I found one. It's a bit different from what 
Warren proposed years ago, but it has the advantage that the problem for 
party list PR and for individual PR is very similar.

Let's take the individual PR problem first:

We're given a set c of C candidate points and a set v of N voter points. 
We're also given a number S (number of seats) so that N mod S = 0. The 
output is a subset e of c, so that there are S points in e.

Define a link between a point in c and a point in v as a graph edge with 
weight equal to the distance (in some geometric space) between the points.

Define a mapping between e and v to be a graph where each member in v 
has a link to some member in e, and each member in e has the same degree 
(number of links).

Define a minimum mapping between e and v (for a given subset e) to be a 
graph where the links between v and e are set to minimize the sum of 
distances.

Define the minimum mapping sum between e and v to be the sum of 
distances for the minimum mapping between e and v.

Then we wish to find an e so that the minimum mapping sum between e and 
v is minimized. (Side note: this is very similar to Monroe's method.)

---

For party list PR, the problem is very similar, except that we're 
permitted to "clone" any point in c. The output is no longer a proper 
set, but can contain one or more copies of each point.

---

The intuition here is of having a city redistribution program. Each city 
can take the same number of people, and we want to find a number of 
cities so that, when we allocate each person to one of the cities, the 
sum of the distances they have to travel is minimized. More generally, 
each group of voters is of the same proportion (that's the 
"proportional" part) and each group of voters is represented most 
accurately by minimizing distance (that's the "representation" part).

When you turn this into a single-winner method, you get the problem of 
finding a city so that, if you relocate everybody there, they have to 
move the least distance. This would, in a Yee diagram, resolve to the 
Voronoi result of an optimal method.

So how does this relate to Yee diagrams in general? Well, I once tried 
to make Yee diagrams for multiwinner methods. These seemed to exhibit 
some pattern, being similar to Voronoi maps when the voters were 
concentrated around a single point, but giving more space to centrists 
when the standard deviation is increased. But I didn't know what I was 
looking for. The individual-candidate PR methods were either not 
proportional or not monotone.

But with the reduction above, I *do* know what to look for. And since in 
a Yee diagram, all distances are known, I can then make optimal Yee 
diagrams for the multiwinner case and find out what they look like. All 
I have to make sure of is that N mod S = 0 since I don't know how to 
generalize it if not. (Perhaps one could consider each voter to be a 
real thing and so infinitely subdividable.)

Another interesting part is that when every voter is also a candidate, 
this reduces to Warren's clustering problem. With optimal clustering, 
each cluster would have as many voters attached to it as every other 
cluster.

---

Finally, this suggests a Monroe-type algorithm for rated party list PR. 
Instead of minimizing distances, maximize scores. Because each party can 
be cloned, we would only need to check every combination of parties. The 
only thing left would be to ensure it would reduce to Sainte-Laguë when 
everybody max-rates a single party and min-rates the others...

To generalize to rankings would be harder. One might use the Kendall-tau 
distance, but I don't know how good that would be at optimizing the 
hidden distances behind the rankings (assuming each citizen ranks 
closest cities first). It would also have the problem that I encountered 
while making CFC-Kemeny -- that you'd need a ranking, not just a single 
candidate or set, per cluster.