[EM] Feature extraction and criteria for multiwinner elections

Sun Jan 4 04:47:54 PST 2009

Raph Frank wrote:
> On Sat, Jan 3, 2009 at 1:40 PM, Kristofer Munsterhjelm
> <km-elmet at broadpark.no> wrote:
>> A (seemingly) reasonable generalization of the Euclidean distance Voronoi
>> would be this: For each point, find the two candidate points so that the sum
>> of the distances to those two points is minimized. Color p according to the
>> composite color of the k closest candidates (for a (k,n) election). But
>> doesn't that correspond to the election method where you elect the CW, then
>> remove him and elect the next CW and continue like that until done? That
>> method is not PR.
> 
> One option would be
> 
> for each possible set of N winners
> - find the average distance from each voter to the closest winner
> 
> The 'best' winning set is the one that minimises this average distance.

That suggests a rated voting method: For each possible set, each voter 
contributes a score equal to that voter's rating of the highest rated 
candidate that's in the set. Greatest sum wins.
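
A rough brute-force sketch of that rated method, just to pin down what I mean. The function name, the ballot format (one dict of ratings per voter) and the 0-10 ratings are my own assumptions for illustration, and enumerating every set is only feasible for small elections.

```python
from itertools import combinations

def best_council(ballots, seats):
    """ballots: list of dicts mapping candidate -> rating (one dict per voter).
    Returns the council whose summed 'best rated member' score is greatest."""
    candidates = sorted({c for ballot in ballots for c in ballot})

    def score(council):
        # Each voter contributes the rating of their favorite council member.
        return sum(max(ballot.get(c, 0) for c in council) for ballot in ballots)

    # Ties are broken by enumeration order, which is arbitrary.
    return max(combinations(candidates, seats), key=score)

# Hypothetical ratings for a Left/Center/Right electorate.
ballots = (
    [{"L": 10, "C": 5, "R": 0}] * 46
    + [{"R": 10, "C": 5, "L": 0}] * 46
    + [{"C": 10, "L": 5, "R": 0}] * 8
)
print(best_council(ballots, 1))
print(best_council(ballots, 2))
```

With these made-up ratings, the one-seat call picks C and the two-seat call picks L and R.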

Using a closest winner metric has a kind of obscuration/concealment 
effect, though. Consider the following very unrealistic condition:

The voters are all either Republicans or Democrats. That is, on the 
voting line, every voter is either at the R position or the D position. 
Ten seats to be filled, ten Democrat candidates and ten Republican 
candidates.

Say the voting population is split 50-50. Then, as long as there's at 
least one Republican and one Democrat in the outcome set, the rest of 
the set is unconstrained: the Democratic voters measure their distance 
to the one Democrat who is sure to be elected, and the Republican voters 
measure theirs to the one Republican who is sure to be elected.

So let's expand on this a little: say there are three positions: A, B, 
and C. 49% support A, 49% support B, and 2% support C. Ten candidates at 
each point stand for election - call them A1, A2, etc. for A, and 
analogously for B and C. Then any outcome that has at least one each of 
A, B, and C will be considered optimal, even something like

A1, B1, C1, C2, C3, C4, C5, C6, C7, C8,

which is extremely disproportional. In effect, A1 "cloaks" the 
disproportionality of the rest of the council to the A voters, and B1 
does the same to the B voters.
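
To make the cloaking concrete, here is a small sketch. The coordinates for A, B and C are arbitrary assumptions of mine (the example only needs the three points to be distinct), and a candidate's position is read off the first letter of its name.

```python
# Hypothetical coordinates on the voting line; only their distinctness matters.
position = {"A": 0.0, "B": 1.0, "C": 3.0}
share = {"A": 0.49, "B": 0.49, "C": 0.02}

def mean_closest_distance(council):
    """Average distance from each voter to the closest council member."""
    return sum(
        weight * min(abs(position[group] - position[cand[0]]) for cand in council)
        for group, weight in share.items()
    )

cloaked = ["A1", "B1"] + [f"C{i}" for i in range(1, 9)]
balanced = [f"A{i}" for i in range(1, 6)] + [f"B{i}" for i in range(1, 5)] + ["C1"]
no_c = [f"A{i}" for i in range(1, 6)] + [f"B{i}" for i in range(1, 6)]

for name, council in [("cloaked", cloaked), ("balanced", balanced), ("no C", no_c)]:
    print(name, mean_closest_distance(council))
```

The cloaked and the balanced council both come out at distance zero, so the metric can't tell them apart; only the council with no C member at all scores worse.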

But note that all of these "cloaked" councils have the same score (or 
distance). So it might be that they are all still preferable to every 
other council, and the only problem is that the metric can't tell which 
of the tied councils is better than the others. I'm not sure if that's 
the case.

In any event, the "obvious" way of continuing is to somehow give weight 
to candidates farther away. There are two simple ways of doing this. The 
first is PAV style: weight the distance to the closest by f(1), the 
distance to the second closest by f(2) and so on, where f(x) = 1/x for 
D'Hondtian PAV; but the choice of f seems pretty arbitrary.
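
A sketch of the PAV-style weighting, using a four-seat version of the A/B/C example. As before, the coordinates are hypothetical (and B sitting between A and C affects the exact numbers), and weighted_distance is just a name I made up.

```python
# Hypothetical coordinates and the 49/49/2 split from the example.
position = {"A": 0.0, "B": 1.0, "C": 3.0}
share = {"A": 0.49, "B": 0.49, "C": 0.02}

def weighted_distance(council, f=lambda k: 1.0 / k):
    """Sum over voters of f(k) times the distance to the k-th closest member.
    f(k) = 1/k is the D'Hondt-like choice; lower totals are better."""
    total = 0.0
    for group, weight in share.items():
        dists = sorted(abs(position[group] - position[c[0]]) for c in council)
        total += weight * sum(f(k) * d for k, d in enumerate(dists, start=1))
    return total

def closest_only(k):
    # Recovers the plain closest-winner metric.
    return 1.0 if k == 1 else 0.0

cloaked = ["A1", "B1", "C1", "C2"]
proportional = ["A1", "A2", "B1", "B2"]

print(weighted_distance(cloaked, closest_only), weighted_distance(proportional, closest_only))
print(weighted_distance(cloaked), weighted_distance(proportional))
```

With closest_only the cloaked council looks better (distance zero), while with f(k) = 1/k the proportional council gets the lower total. Other choices of f would shift the balance, which is the arbitrariness I'm complaining about.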

The second is to peel off, STV style. Let's take a less extreme version 
of the above example - four to be elected and

A1, B1, C1, C2

as proposed council. Normalize the voter counts: the A voters are 49, the 
B voters are 49, and the C voters are 2. Each candidate has a count of 25 
(100 voters / 4 seats). Then, until done, pick a random voter, add the 
distance to the closest remaining candidate, and decrement both the count 
of that voter's group and the candidate's count. Remove candidates that 
reach zero.

What'll happen? The A voters will erode A1 (49 against 25, so A1 
disappears and the A voters' count is 24). Meanwhile, B voters erode B1 
in the same way. C voters disappear before either C1 or C2 does, so at 
the end, you have

A voters: count 24
B voters: count 24

C1 and C2 remain, and they will give very long distances against the 
remaining A and B voters before they're eroded.

But like PAV, this has its problems. From STV, we know that the order in 
which the candidates are eliminated matters, so which voters get picked 
also matters here. We can somewhat get around it by running it with 
random pairing an extremely large number of times, but it's kinda iffy 
still.
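
Here is one way the peel-off procedure could be simulated. Everything beyond the description above is an assumption on my part: the coordinates, picking a voter with probability proportional to the remaining group counts, and breaking ties between equally close candidates by list order.

```python
import random

def peel_off_distance(voters, council, position, quota, rng):
    """voters: dict group -> count. council: candidate names whose first
    letter is their position. Returns the accumulated distance of one run."""
    voters = dict(voters)                    # work on a copy
    remaining = {c: quota for c in council}  # each candidate's count
    total = 0.0
    while remaining and any(voters.values()):
        groups = [g for g, n in voters.items() if n > 0]
        # Pick one random remaining voter, i.e. a group weighted by its count.
        g = rng.choices(groups, weights=[voters[g] for g in groups])[0]
        # Closest remaining candidate (ties go to the earlier one in the list).
        c = min(remaining, key=lambda c: abs(position[g] - position[c[0]]))
        total += abs(position[g] - position[c[0]])
        voters[g] -= 1
        remaining[c] -= 1
        if remaining[c] == 0:
            del remaining[c]                 # candidate is eroded away
    return total

position = {"A": 0.0, "B": 1.0, "C": 3.0}    # hypothetical coordinates
voters = {"A": 49, "B": 49, "C": 2}

def average_distance(council, runs=200, seed=0):
    rng = random.Random(seed)
    return sum(peel_off_distance(voters, council, position, 25, rng)
               for _ in range(runs)) / runs

cloaked = average_distance(["A1", "B1", "C1", "C2"])
proportional = average_distance(["A1", "A2", "B1", "B2"])
print(cloaked, proportional)
```

Averaging over many runs is the "run it a large number of times" workaround. With these coordinates the council with two C members accumulates much more distance than A1, A2, B1, B2, since the leftover A and B voters have to be matched against C1 and C2.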

> This means that the size of the Gaussian would matter.  If all the
> voters were concentrated at a single point, then the winner would just
> be the double CW like you said.
> 
> However, if there is a large spacing between the voters, then the
> effect would be to elect candidates closer to the edges.

I'm not so sure. One extreme of the Gaussian is to have all voters at the 
same position. In that case, the two CWs win. In the other case, you have 
a nearly uniform distribution, tapering only very slightly. Say that its 
center is at 0.3. Then Center is the obvious first pick. We're left with 
the choice between Left (<-0.5) and Right (>0.5). Because the center is 
> 0, ever so slightly more voters will be close to Right than to Left. 
Hence, the double CWs win again.

>> 46: Left > Center > Right
>> 46: Right > Center > Left
>>  8: Center > Left > Right
>>
>> which should elect Center in a single-winner election, but Left and Right in
>> a multiwinner one?
> 
> Yes, I think this is perfectly reasonable.  Centre is a compromise
> between all the voters.  However, if there are 2 seats, then each
> faction should be allowed to pick its own winner.
> 
> Left + Right means that 92% of the voters get their top choice
> elected.  Which is better than Centre + Right or Centre + Left.

Yes, I agree. The question was just a continuation of the line above "So 
how does that give with our more radical example?". The answer is, as I 
gave there, that the radical example can't be constructed from a single 
Gaussian.
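
For reference, a quick check of the numbers in the 46/46/8 example quoted above: Center is the single-winner CW, and Left + Right gives the largest share of voters their top choice. The helper names are mine.

```python
from itertools import combinations

ballots = [
    (46, ["Left", "Center", "Right"]),
    (46, ["Right", "Center", "Left"]),
    (8, ["Center", "Left", "Right"]),
]
candidates = ["Left", "Center", "Right"]

def condorcet_winner():
    """Return the candidate who beats every other pairwise, or None."""
    for a in candidates:
        if all(
            sum(n for n, r in ballots if r.index(a) < r.index(b))
            > sum(n for n, r in ballots if r.index(b) < r.index(a))
            for b in candidates if b != a
        ):
            return a
    return None

def top_choice_share(council):
    """Number of voters (out of 100) whose first preference is on the council."""
    return sum(n for n, ranking in ballots if ranking[0] in council)

shares = {pair: top_choice_share(pair) for pair in combinations(candidates, 2)}
print(condorcet_winner())
print(shares)
```

Left + Right comes out at 92, while either pair containing Center comes out at 54.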