[EM] One scenario, many methods, by strategies in final poll

Fri Mar 18 10:33:19 PDT 2011

Hi,

I created a nice visual way to review what is happening in my simulation
with regard to strategy. I ran 1000 trials under each method of the 
exact same scenario: On a 1D spectrum, candidates ABC are at 
-.30, .24, .34 on a scale from -1 to +1. There are seven voting blocs
spread out evenly (so that the middle bloc is at 0). The blocs have the
same "base" size, but in any given poll only a random percentage from
40% to 100% of the voters in a given bloc will show up. This creates the
uncertainty.

With this layout, you have three blocs on the "left" (negative side)
with a strong preference for A over B, then C. The middle bloc has weak
preferences of B>A>C. The right three blocs weakly prefer C over B, but
both strongly over A. So respectively A>>B>C, B>A>C, C>B>>A you could
say.
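As a rough check on the layout above, here is a minimal Python sketch of the scenario. The candidate positions and the 40%-100% turnout range are from the description; the exact bloc positions (seven evenly spaced points from -1 to +1) and the names `CANDIDATES`, `BLOCS`, `sincere_ranking` and `poll_turnout` are assumptions made for illustration, not the simulation's actual code.

```python
import random

# Candidate positions are taken from the post.
CANDIDATES = {"A": -0.30, "B": 0.24, "C": 0.34}
# ASSUMPTION: seven blocs evenly spaced from -1 to +1, middle bloc at 0.
BLOCS = [-1 + i / 3 for i in range(7)]

def sincere_ranking(position):
    """Rank candidates from nearest to farthest from a bloc's position."""
    return sorted(CANDIDATES, key=lambda c: abs(CANDIDATES[c] - position))

def poll_turnout(rng, base_size=1.0):
    """One poll: each bloc shows up at a random 40%-100% of its base size."""
    return [base_size * rng.uniform(0.4, 1.0) for _ in BLOCS]

rankings = ["".join(sincere_ranking(p)) for p in BLOCS]
print(rankings)
```

Under the assumed spacing this reproduces the three A>B>C blocs, the single B>A>C bloc and the three C>B>A blocs. A tighter spacing (for example steps of 0.25) would put the first right-hand bloc closer to B than to C, so the spacing has to exceed about 0.29 to match the description.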

For simplicity I am just grabbing the votes from the final poll. So you
can't directly tell how stable the votes were. I'll describe the
strategies using a seven-character string, one character for each bloc
from left to right. A "-" means sincere, C compromise, M compression,
T truncation, B burial, and P pushover (voting the worst candidate top).
Note that in these strings B and C are strategy codes, not candidates.
I also used * for simultaneous compression and pushover, but it was rare.
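For reading the lists that follow, the legend can be written as a small lookup table. This is only an illustrative sketch: `STRATEGY_CODES`, `BLOC_FAVORITES` and `decode` are hypothetical names, and the left-to-right bloc order is inferred from the scenario (three A blocs, one B bloc, three C blocs).

```python
# Strategy codes as given in the legend above.
STRATEGY_CODES = {
    "-": "sincere",
    "C": "compromise",
    "M": "compression",
    "T": "truncation",
    "B": "burial",
    "P": "pushover",
    "*": "compression+pushover",
}

# ASSUMPTION: blocs are listed left to right, so the first three favor A,
# the middle one favors B, and the last three favor C.
BLOC_FAVORITES = ["A", "A", "A", "B", "C", "C", "C"]

def decode(pattern):
    """Turn a seven-character strategy string into (favorite, strategy) pairs."""
    if len(pattern) != len(BLOC_FAVORITES):
        raise ValueError("expected one character per bloc")
    return [(fav, STRATEGY_CODES[code])
            for fav, code in zip(BLOC_FAVORITES, pattern)]

for favorite, strategy in decode("---PCCC"):
    print(favorite, strategy)
```

So "---PCCC" reads as: the A blocs sincere, the B bloc using pushover, and all three C blocs compromising.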

I should note that the results from this scenario may not be 
representative of all scenarios. I thought the results of this one were
a bit interesting, is all, and it takes time to review the differences
between scenarios.

First let's do Random Ballot. There were only five possibilities:
-------	962
----C--	14
-----C-	13
------C	10
----CC-	1

So you can see that overwhelmingly the final state involved all voters
being sincere.

The "Consensus or Else" method performs Random Ballot unless the first
preference winner has 60% of the vote, in which case he simply wins:
-------	497
----CCC	494
----C--	5
------C	2
-----C-	2

So, half of the time everyone was sincere, and half the time the C voters
compromised to give B a chance at the 60%. It was uncertain to me whether
this method would work at all, since in theory I suppose the voters are
supposed to know that if they can think of a compromise choice, they
should vote for it. But in practice, in the sim, the blocs can't "talk"
to each other or anything.
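The "Consensus or Else" rule as described above can be sketched in a few lines. This is a minimal sketch, not the simulation's code: the function name, the weighted first-preference input, and treating exactly 60% as enough are all assumptions.

```python
import random

def consensus_or_else(first_prefs, rng, threshold=0.60):
    """first_prefs maps candidate -> weight of first-preference votes.

    ASSUMPTION: reaching the threshold exactly counts as having 60%.
    """
    total = sum(first_prefs.values())
    leader = max(first_prefs, key=first_prefs.get)
    if first_prefs[leader] >= threshold * total:
        return leader  # the first-preference winner simply wins
    # Otherwise fall back to Random Ballot: draw one ballot at random
    # (weighted by first-preference support) and elect its top choice.
    names = list(first_prefs)
    weights = [first_prefs[n] for n in names]
    return rng.choices(names, weights=weights)[0]

rng = random.Random(1)
print(consensus_or_else({"A": 30, "B": 65, "C": 5}, rng))
print(consensus_or_else({"A": 45, "B": 15, "C": 40}, rng))
```

In the first call B clears 60% and wins outright. In the second nobody does, so the result is a random draw, which is why the C voters have a reason to move their first preferences to B.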

FPP:
---C---	500
----CCC	499
CCC----	1

Only three possibilities. Surprisingly it is about even whether the B
voters or C voters used favorite betrayal. In one single case the A voters
did it.

My disqualification plurality (VDP/VFA):
----CCC	65
BBBBCCC	16
BBB-CCC	15
TBT-CCC	11
B--BCCC	11
BBBTCCC	11
--B-CCC	11
TBTBCCC	11
...

The C voters compromise all the time. It is only a touch more sincere-
Condorcet-efficient than FPP.

Top-two runoff:
----CCC	439
-----CC	191
----C-C	187
----CC-	177
---PCCC	5
---C---	1

Still a lot of compromise.

In contrast, the VFA ballot runoff:
BBBBCC-	109
BBBBC-C	100
BBBB-CC	79
BBB-CC-	14
BBBB---	10
BBBB-C-	10
...

In other words the A voters reliably voted "against" B, B voters usually
voted against A, and C voters frequently gave their "for" vote to B,
voting (sincerely) "against" A.

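Results? TTR elected B 93.4% of the time and VFAR only 88.3%. But TTR's
sincere Condorcet efficiency was 93.5% to VFAR's 98.7%. (B is not the
sincere CW in every trial, since turnout varies, so the two figures
measure different things.)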
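To make "sincere Condorcet efficiency" concrete, here is a minimal sketch of how it could be computed. It assumes the evenly spaced bloc positions from -1 to +1, and it assumes the efficiency is taken over trials that have a sincere Condorcet winner; the names `sincere_condorcet_winner` and `scwe` are hypothetical, and the simulation may well define things differently.

```python
CANDIDATES = {"A": -0.30, "B": 0.24, "C": 0.34}
BLOCS = [-1 + i / 3 for i in range(7)]  # ASSUMED bloc positions

def sincere_condorcet_winner(weights):
    """Candidate who wins every sincere pairwise contest, or None."""
    def support(x, y):
        # Total weight of blocs strictly closer to x than to y.
        return sum(w for p, w in zip(BLOCS, weights)
                   if abs(CANDIDATES[x] - p) < abs(CANDIDATES[y] - p))
    for x in CANDIDATES:
        if all(support(x, y) > support(y, x) for y in CANDIDATES if y != x):
            return x
    return None

def scwe(trials):
    """trials: list of (turnout weights, elected candidate) pairs.

    ASSUMPTION: trials without a sincere Condorcet winner are skipped.
    """
    hits = total = 0
    for weights, elected in trials:
        cw = sincere_condorcet_winner(weights)
        if cw is None:
            continue
        total += 1
        hits += (elected == cw)
    return hits / total if total else float("nan")

print(sincere_condorcet_winner([1, 1, 1, 1, 1, 1, 1]))
print(sincere_condorcet_winner([1, 1, 1, 0.4, 0.5, 0.5, 0.5]))
```

With full turnout everywhere B is the sincere CW, but when the left blocs turn out heavily and the rest do not, A is. That is how a method can elect B less often and still score a higher efficiency.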

Now IRV. The full list is quite long due to the A voters being unable
to determine how best to use their second preference.
----CCC	88
----CC-	44
-----CC	39
----C-C	36
--T-CCC	34
-T--CCC	23
--T-C-C	22
--T-CC-	22
T---CCC	19
T---CC-	17
...

If we go lower the A voters had some burial as well. It is nice to see
that sincerity beat truncation and truncation beat burial though. The
C voters were typically voting for B instead of C, while the B voters
presumably saw no reason to ever compromise.

QR's compromise disappointed me:
----CC-	69
----C-C	69
-----CC	64
----CCC	56
---BCCC	30
T---CC-	22
--T-C-C	19
...

The A voters were quite confused as in IRV, though you can't quite see it
from the top of the list here.

If we try Condorcet//IRV:
TTT-C-C	26
TTT-CC-	25
TTT--CC	25
TTB-CC-	21
BTT-CCC	19
TTB--CC	18
BTT-CC-	18
TBT-CC-	18
TBT-CCC	18
...

I'd have to say I don't think this is too promising. The A voters are
never sincere and the C voters are compromising a lot.

Here's the IRNR method:
BBB-CCC	165
BBT-CCC	41
TBB-CCC	41
BTB-CCC	38
---T---	30
TBB-CMM	21
BTB-CMM	16
...

There are a few of these BBB-CCC methods. A pretty ugly situation, 
although electing B most of the time gave IRNR decent Condorcet 
efficiency.

Let's look at the King of the Hill (KH) method:
---T---	324
---TCCC	26
----CCC	16
-T--CC-	15
----CC-	15
...

Almost a third of the trials ended with no strategy except bullet voting
by the B voters. If we continue down the list, though, the compromise
by various C blocs continues.

If we stick Condorcet on the front of KH (C//KH):
---T---	161
TTT-CC-	25
TTB-CCC	20
BTT-CCC	19
TTT-CCC	19
TTT--CC	17
TTT-C-C	16
...

I kind of think it looks worse.

DSC is a bit interesting... I am including DSC with no ER allowed (top),
and DSC where it is allowed (second). So the only difference between
these methods is whether tying at the top (compression) is possible.
While the methods look a little different, few in the end are actually
using any compression. Puzzling.
-------	392
---BCCC	17
----CCC	16
BBB-CCC	12
--B-CCC	11
--TBCCC	10
... (DSC no ER)

-------	509
----CCC	38
---M---	30
---BCCC	26
B---CCC	13
--BC---	12
--B-CCC	12
BBB-CCC	12
... (DSC ER)

The ER version looks better to me, and it also has better (though still
quite bad) sincere CW efficiency, which incidentally is not something you
can depend on in this scenario. But why should it be better?

Next let's do DAC and transition into the Bucklinesque methods:
TTTT---	584
TT-T---	60
-TTT---	53
T-TT---	47
TTTTCCC	29
TTT-CCC	22
T--T---	22
...

So, truncation heavy (A and B blocs) with touches of compromise.

As I write this I realize that I allowed equal-ranking but didn't allow
tied-at-the-top preferences to be counted as FPs. It's likely that DAC
would actually be a compression-heavy method if I fixed this.

Bucklin itself (no ER):
TTTT---	702
-TTT---	40
T-TT---	35
TT-T---	31
TTT-CCC	31
TTTTCCC	30

Similar but less varied. DAC's sincere CW efficiency was a touch better.

Now QLTD, Woodall's Bucklin variant:
TTTT---	518
TTT-CCC	65
TTTTCCC	56
T-TT---	53
-TTT---	50
TT-T---	41
...

The compromise makes it look worse than Bucklin, and also the "SCWE"
(short for "sincere CW efficiency") was worse.

Now MCA or ER-Bucklin(whole):
TTTTM--	214
TTTT---	209
TTTT--M	144
TTTT-M-	119
TTT-MMM	55
TT-TM--	16
...

Generally much bullet-voting from A and B voters and some compression
from C (especially if the B voters vote B>A).

My Bucklin variant (VBV) with and without equal rankings:
---T---	460
TTT----	208
T-T----	36
-TT----	36
TT-----	28
-T-----	21
...   (no equal rankings)
---T---	546
BTT-CCC	25
TTB-CMM	22
TTB-CCC	22
TBT-CCC	22
BTT-CMM	17
---TMMM	16
...   (equal rankings allowed)

It is interesting how allowing equal rankings creates not just compression
but also burial and outright compromise. Despite this, the SCWE of the
ER version was better, at 92.7% vs 80.8%! (The strict version was less 
capable of electing B.)

Conditional Approval (CdlA):
-------	162
---B---	80
TTT-CMM	70
TTT-MMC	51
TBT-CCC	48
TTB-CCC	44
BTT-CCC	40
TTT-MCM	39
...

I like that fairly large chunk of sincerity or near-sincerity. On that
second line you can see the B voters are trying to give C some more
votes, to force A voters to give up their B preferences. I wonder if
technically that is an example of "pushover."

Chris' SMDTR:
---TM--	110
---T--M	67
----MMM	63
---T-M-	61
-T--MMM	43
-TT-MMM	42
...

Fair amount of compression. Middling SCWE.

Chris' IBIFA (original definition):
TTT-MMM	592
---TTTT	39
TTB-MMM	36
BTT-MMM	34
TBT-MMM	31
----M--	25
TTT----	19
...

Looks like not that many lower preference slots are getting used. SCWE
was slightly worse than SMDTR.

Antiplurality:
-------	276
--B----	235
-B-----	231
B------	217
---B---	32
...

Not a lot of different scenarios. B wins virtually all the time, which
gives mediocre SCWE despite all the sincerity.

MAP (Majority Favorite//Antiplurality):
-------	220
B------	204
--B----	179
-B-----	175
---B---	45
-----CC	11
...

Rather similar sincerity, and SCWE jumps from 87.1% to 97.4%.
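Going only by the name (Majority Favorite//Antiplurality), MAP could be sketched as below. This is an assumption-laden illustration: it takes full strict rankings, requires a strict majority of first preferences, and breaks ties by candidate order, none of which is confirmed here.

```python
def map_winner(ballots):
    """ballots: list of (weight, ranking) with full strict rankings."""
    total = sum(weight for weight, _ in ballots)
    first, last = {}, {}
    for weight, ranking in ballots:
        first[ranking[0]] = first.get(ranking[0], 0) + weight
        last[ranking[-1]] = last.get(ranking[-1], 0) + weight
    # Step 1: a majority favorite wins outright.
    for candidate, votes in first.items():
        if votes > total / 2:
            return candidate
    # Step 2: otherwise Antiplurality, i.e. fewest last-place votes.
    # ASSUMPTION: ties go to the candidate listed first.
    candidates = ballots[0][1]
    return min(candidates, key=lambda c: last.get(c, 0))

# Sincere ballots with equal bloc sizes (an illustrative assumption).
print(map_winner([(3, "ABC"), (1, "BAC"), (3, "CBA")]))
print(map_winner([(4, "ABC"), (3, "CBA")]))
```

In the first case nobody has a majority and B has no last-place votes, so B wins. In the second A is a majority favorite and wins before Antiplurality is consulted.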

Coombs:
-------	540
BBBC---	238
---B---	154
BBBC-C-	14
BBBC--C	13
...

Coombs has a lot more burial, and SCWE of only 74.1%, being relatively
unable to elect B.

Borda (full strict rankings):
BBB-CCC	662
-B--C--	78
--B-C--	63
B---C--	52
B-B-C--	21
BBBBCCC	19
-BB-C--	17
...

Basically at least 2/3rds of the time you have the A side burying and
the C side compromising. The Condorcet efficiency was pretty poor, at
76.6%.
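A small worked example may help show why the BBB-CCC pattern hangs together under Borda. It is a sketch with equal bloc sizes, which is an assumption (the simulation varies turnout), and it assumes burial by the A voters means A>C>B and compromise by the C voters means B>C>A.

```python
def borda(ballots):
    """ballots: list of (weight, ranking). Returns (winner, scores)."""
    scores = {}
    for weight, ranking in ballots:
        n = len(ranking)
        for place, candidate in enumerate(ranking):
            # Top place earns n-1 points, last place earns 0.
            scores[candidate] = scores.get(candidate, 0) + weight * (n - 1 - place)
    return max(scores, key=scores.get), scores

sincere = [(3, "ABC"), (1, "BAC"), (3, "CBA")]
a_buries = [(3, "ACB"), (1, "BAC"), (3, "CBA")]      # "BBB----"
both = [(3, "ACB"), (1, "BAC"), (3, "BCA")]          # "BBB-CCC"

for name, ballots in [("sincere", sincere), ("A buries", a_buries), ("both", both)]:
    print(name, borda(ballots))
```

With these numbers B wins sincerely, burial by the A voters alone hands the win to C, and once the C voters compromise B wins again. So the A side's burial is only "safe" because the C side is compromising at the same time.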

I checked that Baldwin and Black are similar but a bit milder.

Now for Approval and Range:
TTTTMMM	862
TTTMTTT	48
TTTTMMC	28
TTTTMCM	27
TTTTCMM	23
TMMTMMM	5
TTTTCMC	4
TTTTMCC	2
TTTTCCM	1
(Approval, whole list)
TTTTMMM	812
TTTTMMC	45
TTTTMCM	44
TTTTCMM	43
TTTBMMM	11
TTTTMCC	9
... (Range)

There are some illogical strategies in there. But under both methods,
the most common scenario by far was that the race was won by B (more
likely) or A, with C having no odds at all. That's pretty consistent with
my past impressions. SCWE was 89.7% Approval, 92.4% Range.

Now for some pure pairwise methods. Minmax with Winning Votes:
--B-MMM	213
B---MMM	197
-B--MMM	174
----MMM	126
BBBM---	82
---B---	68
---BT--	15
...

A lot of compression from C voters, and some burial here and there.

If we try, say, Margins, we get:
BBB-CCC	603
---B---	79
BBBC---	44
TBB-CCC	26
BBT-CCC	24
BTB-CCC	19
TBBC---	15
...

It looks worse to me, but the SCWE is still 81% to WV's 87%.
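For readers who want the difference between the two defeat-strength measures spelled out, here is a minimal sketch of Minmax under winning votes and under margins. The ballot format (weight plus a possibly truncated ranking), the function names, and the example numbers are assumptions for illustration; equal ranking is not handled, and ties go to the candidate listed first.

```python
from itertools import permutations

def pairwise(ballots, candidates):
    """votes[x][y] = weight of ballots ranking x strictly above y."""
    votes = {x: {y: 0 for y in candidates if y != x} for x in candidates}
    for weight, ranking in ballots:
        rank = {c: i for i, c in enumerate(ranking)}
        for x, y in permutations(candidates, 2):
            # A ranked candidate is above any unranked (truncated) one.
            if x in rank and (y not in rank or rank[x] < rank[y]):
                votes[x][y] += weight
    return votes

def minmax(ballots, candidates, measure="wv"):
    """Elect the candidate whose worst pairwise defeat is weakest."""
    votes = pairwise(ballots, candidates)
    def worst_defeat(x):
        strengths = [0]
        for y in candidates:
            if y != x and votes[y][x] > votes[x][y]:
                if measure == "wv":
                    strengths.append(votes[y][x])                # winning votes
                else:
                    strengths.append(votes[y][x] - votes[x][y])  # margin
        return max(strengths)
    return min(candidates, key=worst_defeat)

# Hypothetical ballots: the A voters truncate after their favorite.
ballots = [(40, ["A"]), (15, ["B", "A", "C"]), (45, ["C", "B", "A"])]
print(minmax(ballots, "ABC", "wv"), minmax(ballots, "ABC", "margins"))
```

With full strict rankings the two measures always agree, since every ballot takes a side in each contest. They only come apart when truncation or equal ranking is in play, as in the example, where the same ballots elect B under winning votes and C under margins.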

I'll whip through a few of these:
MMPO:
BBB-MMM	341
B-B-MMM	44
BBBM---	43
BBB-CMM	42
...  ^--- so, like WV but with more burial!
WV with no ER allowed:
BBB-CCC	360
BBBC---	175
---B---	61
...  ^--- marginsesque
Margins with no ER allowed:
BBB-CCC	356
---B---	69
BBBC---	51
...  ^--- same
Minmax with no ER or truncation allowed:
BBB-CCC	376
BBBC---	172
---B---	84
...  ^--- same

Interestingly MMPO was the best of these with SCWE of 90.4%. But Raynaud
was even better at 93.8%:
----MMM	150
----M--	119
----MM-	119
----M-M	107
--B-MMM	92
...

Where are all the buriers under Raynaud? I thought this method would
be similar to MMPO.

Forest's TACC:
TTT-CCM	42
TTT-CCC	39
TTT-MMC	38
TTT-CMM	34
TTT-MCM	33
TTT-MMM	32
TTT-CMC	31
...

So, a lot of truncation, and some compromise, but it remains true as I
reported before that TACC has very few burial attempts.

DMC:
-------	298
BBB-CCC	148
---B---	112
BBBC---	52
--T----	49
-T-----	42

I like the top chunk, not so much the second one. SCWE was a bit lame,
lower than WV and Antiplurality but higher than margins.

Cardinal-Weighted Pairwise:
----MM-	241
----M-M	233
----M--	168
---T---	121
-----MM	60
...

Approval-Weighted Pairwise (implicit approval):
-------	377
---B---	91
-T-----	75
--T----	72
T------	66
...

Approval-Weighted Pairwise (explicit approval):
-------	243
B------	61
T------	57
-B-----	57
--T----	56
...

Quite impressive sincerity with these, and at the same time these were
the two top methods wrt SCWE (98.8% explicit, 99.3% implicit).

Condorcet//Approval (implicit):
-------	138
---T---	68
TBT-CCC	66
TTT-CMM	65
TTT-CCC	63
...

and the explicit version:
-------	173
---B---	116
--T----	70
T------	69
-T-----	63
...

I do think the explicit version is junk (even here the burial rate was 
double), but here the SCWE was fourth place for explicit... I will have
to hypothesize that there was just no great opportunity for burial here.

ICA:
TTT-MMM	343
TTT-MCM	76
TTT-CMM	61
TTT-MMC	57
---T---	31

Rather disappointing. It's interesting and puzzling to me that modifying
a method (C//A implicit) so that it satisfies FBC would create, say,
compression incentive where there had not been compromise incentive,
and furthermore truncation incentive that hadn't been there and doesn't
even seem related.

MDDA:
TTTT---	225
TTT-MMM	108
T-TT---	42
TTT-M--	37
TT-T---	34
...

A lot of truncation, some compression.

MAMPO:
BBB-MMM	258
BBB-CMM	49
BBBM---	46
BBB-MCM	37
BBB-MMC	36
...

Wow, look at all that burial, for a method that is supposed to be a
fairly small alteration to Approval!

I've probably said this before, but MDDA and MAMPO were both designed
to satisfy three of Mike Ossipoff's criteria (FBC, SDSC, and SFC). They
both have heavy approval components. But it seems they aren't 
interchangeable.

---

To conclude this long mail I'll just give you the SCWE ranking I've
been referring to.

99+%: AWP implicit
98+%: AWP explicit, VFA runoff, C//A explicit
97+%: MCA, MAP
95+%: DAC, Bucklin, CdlA
93+%: QLTD, KH, CWP, C//KH, Raynaud, IRNR, TACC, MDDA, TTR, C//IRV,
QR, SMDTR, C//A implicit, IRV
89+%: ICA, VBV (ER), Range, IBIFA, MAMPO, MMPO, Approval
85+%: WV, Antiplurality, DMC
74+%: margins, VBV (strict), margins (no ER), Borda, WV (no ER), minmax
(full rankings only), Coombs
50's%: DSC (ER), DSC, VDP/VFA, FPP
40%: "Consensus or Else"
20%: Random Ballot

If you want more stats or a certain scenario, let me know.

Thanks.

Kevin Venzke