[Election-Methods] Determining representativeness of multiwinner methods
Kristofer Munsterhjelm
km-elmet at broadpark.no
Fri Jun 20 15:09:03 PDT 2008
Hello all,
(says the newcomer.)
I think I have found a metric for comparing multiwinner systems, at
least as these pertain to proportional representation, when all votes
are honest.
The advantage of the metric is that, if what it measures is desirable,
it gives an idea of how well a system performs - how representative it
is - and thus of its best-case behavior. In contrast, criterion
failures show how bad a system can get in the worst case.
The broad idea is this: the most proportional assembly is the one that
reflects the population on all issues. In other words, if a fraction p
of the population holds a certain position on a binary issue, it is
better (ceteris paribus) for the fraction of the council holding that
position to be close to p than far from it.
Thus we could make a simulation. First, set that there are n binary
issues. Each voter then has an issue profile consisting of n booleans.
Set these randomly with a different bias for each issue (so that, for
instance, on the first issue 70% may hold the "true" position, while on
another only 23% do).
Counting the proportion that holds the "true" position on each issue
gives the popular issue profile. In general, the issue profile of a
given subset takes the form of n numbers (for n issues), where each
number is the proportion of the subset that holds the "true" position
on the issue in question.
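In Python-like code, the generation step might look like this (a
minimal sketch; the names are arbitrary and not those of my program):

    import numpy as np

    def make_population(num_voters, num_issues, rng):
        # One bias per issue: the probability of holding the "true"
        # position on that issue.
        bias = rng.random(num_issues)
        # Each voter's issue profile is a row of n booleans.
        return rng.random((num_voters, num_issues)) < bias

    def issue_profile(subset):
        # Fraction holding the "true" position on each issue.
        return subset.mean(axis=0)

    rng = np.random.default_rng(1)
    voters = make_population(1000, 10, rng)
    popular = issue_profile(voters)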
Then a perfectly representative assembly has an issue profile that is
equal to the issue profile of the people. So now we have a measure of
how well the assembly or council represents the people: the more its
issue profile differs from that of the people, the less representative
it is.
However, this presents a problem. How does one aggregate the per-issue
differences into a single score? Is a one-percent difference on a
single issue better than a 1/n percent difference on every issue? One
way to solve this is to settle on an aggregation measure (like
root-mean-square error) and hope the results generalize across
measures; another is to use Pareto domination instead, saying that the
councils produced by method A are better than those produced by method
B to the extent that A-councils lie strictly closer to the population
profile, on every issue, than B-councils do. The latter approach gives
no information in cases where some issues are closer under A and some
under B (mutual nondomination).
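Both options are simple enough to state in code (again a sketch, with
arbitrary names):

    import numpy as np

    def rmse(profile, popular):
        # Root-mean-square difference between two issue profiles.
        return np.sqrt(np.mean((profile - popular) ** 2))

    def pareto_dominates(profile_a, profile_b, popular):
        # A dominates B if A is at least as close to the popular
        # profile on every issue, and strictly closer on at least one.
        err_a = np.abs(profile_a - popular)
        err_b = np.abs(profile_b - popular)
        return np.all(err_a <= err_b) and np.any(err_a < err_b)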
Putting all of the pieces together, to figure out the scores, a
simulation would do something like this:

- Generate issue vectors for all of the people, and get the popular
  issue profile.
- Choose a subset of the people as candidates.
- Generate each voter's ballot over all of the candidates.
- For a great number of random assemblies:
    - Get the issue profile of this assembly, and calculate the
      similarity measure for it with regard to the popular issue
      profile.
    - If this assembly is more similar or less similar than any
      random assembly seen so far, update the best (respectively
      worst) record.
- For each multiwinner election system:
    - Feed the ballots into the system.
    - Get the issue profile of the elected assembly, and calculate
      the similarity measure for it with regard to the popular
      issue profile.
    - Normalize the similarity measure with regard to the worst and
      best random councils.
    - Add the normalized similarity measure to that system's
      running total.
To be robust, the simulation would do this many times, with various
population sizes, council sizes, and issue counts (n). With such a
similarity measure, 0 would be perfect (impossible most of the time)
and 1 (or infinity, depending on the measure) would be the worst
possible.
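Assuming RMSE as the measure, the normalization step might look like
this (a sketch with arbitrary names):

    import numpy as np

    def normalized_score(council_profile, popular, random_profiles):
        # 0 means as close as the best random assembly drawn, 1 as far
        # as the worst; a method can score outside [0, 1] by beating
        # (or losing to) every random assembly.
        def rmse(p):
            return np.sqrt(np.mean((p - popular) ** 2))
        scores = [rmse(p) for p in random_profiles]
        best, worst = min(scores), max(scores)
        return (rmse(council_profile) - best) / (worst - best)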
The only thing remaining is to find out how to generate each voter's
ballot. A reasonable assumption is that voters will prefer candidates
who agree with them on many issues to those who agree with them on few.
For binary issues, Hamming distance works: in the simple model, voters
rank (or rate) the candidates in inverse order of Hamming distance.
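For instance (a sketch; ties are broken randomly, as in the simple
model):

    import numpy as np

    def make_ballots(voter_issues, cand_issues, rng):
        # Rank the candidates by increasing Hamming distance from each
        # voter's own issue profile; ties are broken randomly, so no
        # ballot contains equal ranks.
        ballots = []
        for v in voter_issues:
            dists = (v != cand_issues).sum(axis=1)
            tiebreak = rng.random(len(dists))
            ballots.append(list(np.lexsort((tiebreak, dists))))
        return ballots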
--
I have made a program that does this. It is simple and does not use
equal ranks (randomizing preferences instead), but the results are
interesting.
Worst of the lot are the majoritarian systems ported to multiwinner
use. These would, for a council of size k, just pick the first k in the
social order of the single-winner method. This result shouldn't be
surprising, because the straight port excludes minority opinion.
Curiously, however, IRV does best among those; maybe that reflects
IRV's origin as the single-winner case of the multiwinner method STV?
Or maybe noise (resulting from nonmonotonicity and the like) brings it
closer to the results of just picking a random assembly.
Then come the vote-reweighted methods, like RRV. Vote-reweighted
methods can be generalized as: run a single-winner method, then
reweight the voters who supported the winner, according to some
function that does not take the number of seats into account. Then run
again and, disregarding those who have already been elected, pick as
the next member the candidate closest to the top of the social ordering
output.
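For rated ballots, RRV itself fits this pattern. A minimal sketch,
assuming scores in [0, 1] and a ballot weight of k/(k + sum of scores
the voter gave to those already elected) - treat the details as my
reading of RRV rather than a specification:

    import numpy as np

    def rrv(ratings, seats, k=1.0):
        # ratings: a (voters x candidates) array of scores in [0, 1].
        num_voters, num_cands = ratings.shape
        elected = []
        support = np.zeros(num_voters)  # scores given to winners so far
        for _ in range(seats):
            weights = k / (k + support)  # seat count plays no role here
            totals = weights @ ratings
            totals[elected] = -np.inf    # skip those already elected
            winner = int(np.argmax(totals))
            elected.append(winner)
            support += ratings[:, winner]
        return elected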
Best of all were the "proper" methods implemented: STV (with Senatorial
rules) and QLTD-PR, which uses Woodall's QLTD instead of IRV as its
basis: it adds fractional votes until someone exceeds the quota, then
reweights the voters who contributed to that candidate, basing the
weighting on the candidate's surplus.
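Roughly, and discretizing Woodall's continuous trickle-down to whole
ranks (a simplification of mine, as is the Droop-like quota):

    import numpy as np

    def qltd_pr(ballots, num_cands, seats):
        # ballots: list of rankings (candidate indices, best first).
        quota = len(ballots) / (seats + 1.0)   # Droop-like quota
        weights = np.ones(len(ballots))
        elected, removed = [], set()
        depth = 1
        while len(elected) < seats and depth <= num_cands:
            tally = np.zeros(num_cands)
            backers = [[] for _ in range(num_cands)]
            for v, ballot in enumerate(ballots):
                # Each voter supports their top `depth` remaining
                # choices at full (current) weight.
                for c in [x for x in ballot if x not in removed][:depth]:
                    tally[c] += weights[v]
                    backers[c].append(v)
            over = [c for c in range(num_cands) if tally[c] >= quota]
            if not over:
                depth += 1                     # trickle down one rank
                continue
            winner = max(over, key=lambda c: tally[c])
            elected.append(winner)
            removed.add(winner)
            # Contributors keep only the surplus fraction of their
            # weight.
            factor = (tally[winner] - quota) / tally[winner]
            for v in backers[winner]:
                weights[v] *= factor
        return elected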
According to the RMSE scores:
Majoritarian assemblies:
    Borda:             0.871528
    *Plurality:        0.256192
    Antiplurality:     0.73616
    Nauru-Borda:       0.599807
    IRV:               0.362097
    Cardinal ratings:  0.894351

Vote-reweighted assemblies:
    Borda:             0.376745
    *Plurality:        0.260454
    Antiplurality:     0.401539
    Nauru-Borda:       0.406815
    RRV (k = 1.0):     0.682116
    RRV (k = 0.5):     0.644339

Quota:
    *STV:              0.193959
    *QLTD-PR:          0.121693
    QLTD-PR (rated):   0.417813

Other:
    Random Cands:      0.364437

STV-QLTD Pareto dominance: QLTD: 236, STV: 237, nondominated: 674

"Plurality" is the weighted positional system of {1, 0, 0, ...}
applied to ranked ballots.
(* marks those that are better than a random assembly, on average.)
Some of the results may be due to artifacts in the voting pattern - the
simulator was a proof of concept, after all. I think that Plurality
benefits from everyone voting sincerely and from the ballots being
complete, for instance. Yet patterns emerge.
If anyone wants to experiment with the simulation program, it is here:
http://munsterhjelm.no/km/raw/pr_elect.zip . QLTD is called "Quota
Bucklin" there, as I sort of independently discovered it while trying to
make a quota-proportional form of Bucklin.
--
On second thought, it shouldn't be so surprising that vote-reweighted
methods in general do worse than quota-based ones. Consider the
following situation:
20: Left > Center > Right
20: Right > Center > Left
1: Center > Left = Right
Condorcet would pick Center in the single-winner case, since Center
beats each of Left and Right pairwise, 21 to 20. For an assembly of
two, however, the reasonable choice (which CPO-STV picks) would be Left
and Right.
However, vote-reweighted methods based on Condorcet would have to start
off by picking Center, since all voters start off with equal weights.
After it has done so, there is not enough room on the assembly to permit
an even division of Left and Right, and thus either Left or Right will
be favored, assuming Center supports both sides equally.
Vote-reweighted methods that aren't based on Condorcet may pick Left and
Right, but they can only do so if they would pick either Left or Right
in the single-winner case.