[EM] Manipulability stats for more poll methods (fixed footnotes)

Fri May 3 13:18:44 PDT 2024

Oops, I numbered my footnotes incorrectly. Let's do that again. (Ignore 
my last post.)

Here are voter manipulability stats for most of the cardinal and 
cardinal hybrid methods. The exceptions are:
	- MJ, because I'm not confident enough about how I implemented it (in 
particular, what tiebreaker it uses),
	- Margins-Sorted Approval, because I'm not sure how it works, and
	- Approval with manual runoff, because it's difficult to model the 
effects of further discussion between the rounds.

Same testing parameters as the other stats: spatial (Gaussian) model, 4 
dimensions, 4 candidates, 99 voters, 500k elections tested, and 32k 
strategy attempts per election.

As in the last post, I've marked the non-poll methods with an asterisk.

The manipulability values are:

0.937	*Range (0-5, absolute scale)
0.928	Approval (absolute scale)[1]
0.710	*Range (0-10, normalized)
0.708	*Range (0-5, normalized)
0.705	Smith//Range (0-5, absolute scale)
0.666	Approval (mean utility cutoff)
0.655	Smith//Range (0-10, absolute scale)
0.645	STAR
0.564	Smith//Range (0-5, normalized)
0.557	Smith//Range (0-10, normalized)
0.514	Smith//Approval (explicit, mean utility cutoff)
0.490	Smith//Approval (implicit, mean utility cutoff)

0.443	Smith//DAC (mean utility truncation)[2]

Some values from my last post for reference:

0.480	Copeland//Borda (Ranked Robin)
0.417	Plurality
0.333	Schulze, minmax
0.074	Condorcet-IRV

and for verification, James Green-Armytage's results are:[3]

0.710	Range (normalized)
0.668	Approval (mean utility cutoff)

Range is not part of the poll, but it serves to show the differences 
between absolute and relative (normalized) scales, and to show that my 
results are similar to JGA's.

"Absolute scale" gives the voters a common scale to rate on, to model 
the Range component passing IIA. In this model, a voter's utility for a
candidate is at its maximum when the candidate sits at the voter's own
point in opinion space, and the bottom of the scale is set at the
utility that a random voter-candidate pair from the spatial model would
exceed 90% of the time. Since this threshold doesn't depend on which
candidates were selected, it's an absolute scale, and on average 10% of
the voter-candidate judgements are clamped to zero.
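
To make that concrete, here is a minimal Python sketch of one way to
build such a scale. It is not the simulator's actual code: the utility
function (negative Euclidean distance), the unit-variance Gaussian, the
linear mapping, the rounding and all the function names are assumptions
made for illustration.

```python
import math
import random

def utility(voter, candidate):
    # Assumed utility: negative distance, so 0 is the best possible value.
    return -math.dist(voter, candidate)

def scale_floor(dims=4, samples=100_000, seed=1):
    # Estimate the utility that a random voter-candidate pair exceeds
    # about 90% of the time (i.e. the lowest tenth falls below it).
    rng = random.Random(seed)
    utils = sorted(
        utility([rng.gauss(0, 1) for _ in range(dims)],
                [rng.gauss(0, 1) for _ in range(dims)])
        for _ in range(samples)
    )
    return utils[samples // 10]

def absolute_rating(u, floor, max_rating=5):
    # Map [floor, 0] linearly onto [0, max_rating]; anything below the
    # floor is clamped to zero. Linear mapping is an assumption.
    frac = (u - floor) / (0.0 - floor)
    frac = min(1.0, max(0.0, frac))
    return math.floor(frac * max_rating + 0.5)
```

The floor is computed once from the model itself, not from the
candidates in any particular election, which is what makes the
resulting ratings comparable across voters.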

On the other hand, "normalized" has the voters rate their least favorite
at zero and their favorite at the maximum.
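
As a rough Python sketch of normalization (again an illustration, not
the simulator's code; the linear scaling in between, the rounding, the
handling of a fully indifferent voter and the function name are
assumptions):

```python
import math

def normalized_ratings(utilities, max_rating=5):
    lo, hi = min(utilities), max(utilities)
    if hi == lo:
        # Assumption: an indifferent voter rates everyone at the top.
        return [max_rating] * len(utilities)
    # Least favorite maps to 0, favorite to max_rating, linear in between.
    return [math.floor((u - lo) / (hi - lo) * max_rating + 0.5)
            for u in utilities]

voter = [0.0, -0.42, -0.48, -1.0]
print(normalized_ratings(voter, 5))   # [5, 3, 3, 0]
print(normalized_ratings(voter, 10))  # [10, 6, 5, 0]
```

In this example the two middle candidates get the same rating on the 0-5
ballot, while the 0-10 ballot can tell them apart.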

"Mean utility cutoff" is the (relative scale) Approval guideline where 
the voter approves every candidate above mean utility and disapproves 
everybody else. Though a relative scale, it's not quite the same thing 
as "normalized".
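
A small Python sketch of the guideline, with a comparison that shows
one way the two can differ. The "midpoint" rule below is only an
assumption about what a normalized two-level ballot would look like
(approve everything above halfway between worst and best); the function
names are made up for illustration.

```python
def mean_cutoff_approvals(utilities):
    # Approve every candidate strictly above the voter's mean utility.
    mean = sum(utilities) / len(utilities)
    return [u > mean for u in utilities]

def midpoint_approvals(utilities):
    # Assumed stand-in for a normalized 0-1 ballot: approve everything
    # above the midpoint between least and most favorite.
    midpoint = (min(utilities) + max(utilities)) / 2
    return [u > midpoint for u in utilities]

voter = [0.0, -1.0, -1.4, -3.0]
print(mean_cutoff_approvals(voter))  # [True, True, False, False]
print(midpoint_approvals(voter))     # [True, True, True, False]
```

Both rules depend only on the candidates actually running, so both are
relative, but the mean is pulled around by every candidate's utility
while the midpoint only looks at the best and the worst.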

STAR uses a scale of 0-5 inclusive. Since the official STAR ballot text 
tells the voters to normalize,[4] I've only included the normalized 
manipulability value.

For most of the other cardinal methods and their hybrids, I've given 
both 0-5 and 0-10 ballot formats. The 11-slot ballot makes it easier to 
show a difference of preference, which helps identify the honest Smith 
set in Smith//Range. However, there's not otherwise much of a difference.

-km

[1] Absolute scale Approval has a high tie rate of 5%, and my simulator
deliberately only checks elections with unique honest winners, so those
tied elections are left out. It's thus possible that it should "really"
be worse than Range.

[2] The detailed stats suggest that pushover is a problem with 
Smith//DAC. However, getting a per-strategy breakdown for cardinal 
methods is hard due to limitations of my simulator, so it would still 
have to be verified by other means. The "mean utility truncation" is 
what makes it cardinal in my simulator's eyes.

[3] Green-Armytage, James (2011). "Four Condorcet-Hare hybrid methods 
for single-winner elections". Voting matters (29): p. 7; 
https://www.votingmatters.org.uk/ISSUE29/I29P1.pdf

[4] https://www.starvoting.org/paper_ballots Step 3.