[EM] Simulations on method similarity

Tue May 4 22:51:46 PDT 2010

Hello,

I wrote a simulation a while ago that compares the similarity of methods
in three-candidate elections. I was playing with it this evening and
putting together some numbers, which unfortunately hasn't left me much
time to write a post on it.

I included 15 methods this time, called:
fpp, appr, irv, mmpo, schwv, schm, ca, buck, pos, dsc, dac, vfa, map,
spst, cdla.

Most should be clear. Schwv is WV, schm margins, ca Condorcet//Approval,
buck Bucklin, pos "positional" (middle ranking is worth half a point, top
ranking 1, truncation zero), map Majority Favorite//Antiplurality (i.e. if
there's a majority favorite, elect him; otherwise elect the candidate with
the fewest strict last place rankings), cdla Conditional Approval (which
I was discussing recently in a post to Forest).
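
As a rough illustration of the two less familiar rules above ("pos" and
"map"), here is a minimal sketch. The ballot representation (a dict from
ranking strings such as "AB" to counts), the function names, and the
alphabetical tie-break are assumptions made for the example, not taken from
the simulation. A ballot is counted as ranking a candidate strictly last
only when it ranks both of the other candidates.

```python
CANDIDATES = "ABC"

def first_preferences(ballots):
    """Count first-place rankings per candidate."""
    scores = {c: 0 for c in CANDIDATES}
    for ranking, n in ballots.items():
        scores[ranking[0]] += n
    return scores

def positional_scores(ballots):
    """Top ranking is worth 1 point, middle ranking 0.5, truncation 0."""
    scores = {c: 0.0 for c in CANDIDATES}
    for ranking, n in ballots.items():
        scores[ranking[0]] += n
        if len(ranking) > 1:
            scores[ranking[1]] += 0.5 * n
    return scores

def map_winner(ballots):
    """Majority Favorite//Antiplurality (ties broken alphabetically: an assumption)."""
    total = sum(ballots.values())
    first = first_preferences(ballots)
    for c in CANDIDATES:
        if 2 * first[c] > total:
            return c  # majority favorite
    # Otherwise count strict last places: only ballots that rank two
    # candidates put the remaining one strictly last.
    last = {c: 0 for c in CANDIDATES}
    for ranking, n in ballots.items():
        if len(ranking) == 2:
            (loser,) = set(CANDIDATES) - set(ranking)
            last[loser] += n
    return min(CANDIDATES, key=lambda c: last[c])

# Made-up ballots, purely for illustration.
example = {"AB": 4, "B": 3, "CB": 2}
pos = positional_scores(example)
pos_winner = max(CANDIDATES, key=lambda c: pos[c])
```

With these made-up ballots nobody has a first-preference majority, B has no
strict last places, and B also has the highest positional score.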

"Approval" means any candidate not truncated receives a vote.

There are 9 ballot types supported: A AB AC B BA BC C CA CB. Equal
ranking is not supported, only truncation. By default each ballot type
occurs with equal frequency, but the six scenarios below modify these
frequencies.

To define the six scenarios I tested, I have to define a few terms:

Wrap: If there's wrap, people aren't necessarily voting on a 1D spectrum,
so that AC and CA ballots are allowed. If there's no wrap, the AC and
CA ballots are burned with no change to any other quantities.

Truncation: If no truncation, the A B and C ballots are burned with no
change to any other quantities.

Spread: If there's spread, the B BA and BC ballot quantities are divided
by three to represent center squeeze.

By toggling these we get eight possible scenarios, but it doesn't make
much sense to have wrap and spread at the same time, so I only did six.

1: Wrap, trunc, no spread. All 9 ballot types
2: No wrap, trunc, no spread. A AB BA B BC CB C
3: No wrap, no trunc, no spread. AB BA BC CB
4: No wrap, no trunc, spread. AB BA BC CB; B are shrunk
5: No wrap, trunc, spread. A AB BA B BC CB C; B are shrunk
6: Wrap, no trunc, no spread. AB AC BA BC CA CB
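
The three toggles and the six scenarios listed above can be written down
compactly. This is only a sketch: the names scenario_weights, SCENARIOS and
draw_election are made up here, and the way a random election is drawn
(a uniform random quantity per ballot type, scaled by its weight) is an
assumption, since the post does not spell out the sampling.

```python
import random

BALLOT_TYPES = ["A", "AB", "AC", "B", "BA", "BC", "C", "CA", "CB"]

def scenario_weights(wrap, trunc, spread):
    """Relative frequency of each ballot type under the given toggles."""
    weights = {b: 1.0 for b in BALLOT_TYPES}
    if not wrap:                      # burn AC and CA
        weights["AC"] = weights["CA"] = 0.0
    if not trunc:                     # burn the bullet votes A, B, C
        for b in ("A", "B", "C"):
            weights[b] = 0.0
    if spread:                        # shrink the B-first ballots
        for b in ("B", "BA", "BC"):
            weights[b] /= 3.0
    return weights

# (wrap, trunc, spread) for scenarios 1..6, as listed in the post.
SCENARIOS = {
    1: (True, True, False),
    2: (False, True, False),
    3: (False, False, False),
    4: (False, False, True),
    5: (False, True, True),
    6: (True, False, False),
}

def draw_election(weights, rng, scale=100):
    """Assumed sampling: uniform random count per type, scaled by its weight."""
    return {b: round(rng.random() * scale * w) for b, w in weights.items()}

election = draw_election(scenario_weights(*SCENARIOS[3]), random.Random(0))
```

In scenario 3, for example, only AB, BA, BC and CB ballots can have nonzero
counts.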

The simulation finds how often each method agrees with every other
method (discarding trials in which all the methods agree), keeps
track of patterns (e.g. does it ever happen that IRV and WV agree against
DSC and Bucklin), and also tracks how often each method violates
Condorcet, Minimal Defense, and Plurality, or elects a candidate with
a voted majority against him (which all methods will do sometimes, though
not necessarily in all the scenarios).
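
To make the agreement measure concrete, here is a small sketch of how such
a rate could be computed. The function name agreement_with and the data
layout (one dict per trial, mapping method name to winner) are assumptions
for the example; the trial results below are invented.

```python
def agreement_with(reference, trials):
    """Percent agreement of each method with `reference`, counting only
    trials in which at least two methods picked different winners."""
    contested = [t for t in trials if len(set(t.values())) > 1]
    if not contested:
        return {}
    rates = {}
    for method in contested[0]:
        if method == reference:
            continue
        agree = sum(1 for t in contested if t[method] == t[reference])
        rates[method] = 100.0 * agree / len(contested)
    return rates

# Invented results for three methods over four trials.
trials = [
    {"schwv": "A", "irv": "A", "fpp": "A"},  # unanimous, so discarded
    {"schwv": "B", "irv": "B", "fpp": "A"},
    {"schwv": "B", "irv": "C", "fpp": "C"},
    {"schwv": "A", "irv": "A", "fpp": "C"},
]
rates = agreement_with("schwv", trials)
```

Here irv agrees with schwv in two of the three contested trials and fpp in
none of them.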

I only ran 50,000 trials of each scenario.

I could dump the raw data here, and may still, but maybe I'll try to
say something interesting first.

One interesting question is, if we were to assume one method were the
best and measure other methods against it, how would they rate, in the
given scenarios? I grabbed numbers on this for WV and for the positional
analysis.

Here are WV's similarities for the six scenarios (in order). Numbers
are the percentage rate of agreement, counted over the trials where there
was any disagreement between any of the methods. The 0 in each row marks
the method's own column.

For ease of reading I also grouped the methods into percentage ranges
below each row.

The order of the numbers in the top row of each block is:
fpp appr irv mmpo schwv schm ca buck pos dsc dac vfa map spst cdla

Scen 1:
schwv 56 56.4 88.8 95.7 0 95.2 91.9 58.7 73.6 57.1 64.4 56.1 43.8 57.5 80.4
90s: mmpo schm ca
80s: irv cdla
70s: pos
60s: dac
50s: fpp appr buck dsc vfa spst
40s: map

Scen 2:
schwv 39.4 64.9 81 95 0 98.2 99.3 72.9 79.3 42.9 72.5 39.6 71 44.6 95.6
90s: mmpo schm ca cdla
80s: irv
70s: buck pos dac map
60s: appr
40s: dsc spst
30s: fpp vfa

Scen 3:
schwv 33.2 66.7 74.8 100 0 100 100 100 83.4 66.7 66.7 41.6 100 66.7 100
100: mmpo schm ca buck map cdla
80s: pos
70s: irv
60s: appr dsc dac spst
40s: vfa
30s: fpp

Scen 4:
schwv 53 46.9 67.7 100 0 100 100 100 81.1 76.4 76.4 54.1 100 76.4 100
100: mmpo schm ca buck map cdla
80s: pos
70s: dsc dac spst
60s: irv
50s: fpp vfa
40s: appr

Scen 5:
schwv 70.8 42.1 81.6 84.7 0 96 98.9 64.4 82.2 71.9 72.8 70.8 46.6 72.8 89.3
90s: schm ca
80s: irv mmpo pos cdla
70s: fpp dsc dac vfa spst
60s: buck
40s: appr map

Scen 6:
schwv 46.7 46.7 82.1 100 0 100 91 60.5 75.5 64.6 64.6 48.8 60.5 65.7 73
100: mmpo schm
90s: ca
80s: irv
70s: pos cdla
60s: buck dsc dac map spst
40s: fpp appr vfa
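
The grouping into percentage ranges is mechanical, so it can be reproduced
from a row of numbers. The sketch below uses the scenario 1 row for WV from
above; the function name group_by_decade is made up for the example.

```python
ORDER = "fpp appr irv mmpo schwv schm ca buck pos dsc dac vfa map spst cdla".split()

# Scenario 1 agreement rates with WV, copied from the first row above.
SCEN1_WV = [56, 56.4, 88.8, 95.7, 0, 95.2, 91.9, 58.7, 73.6,
            57.1, 64.4, 56.1, 43.8, 57.5, 80.4]

def group_by_decade(reference, values):
    """Bucket methods by agreement rate: '100', '90s', '80s', ..."""
    groups = {}
    for method, rate in zip(ORDER, values):
        if method == reference:
            continue  # skip the reference method's own column
        label = "100" if rate >= 100 else f"{int(rate // 10) * 10}s"
        groups.setdefault(label, []).append(method)
    return groups

groups = group_by_decade("schwv", SCEN1_WV)
for label in sorted(groups, key=lambda s: -int(s.rstrip("s"))):
    print(f"{label}: {' '.join(groups[label])}")
```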

Something I've noticed before is that IRV is usually more "WV-efficient"
than DSC, except in scenario 4. There IRV tends to eliminate
the centrist while DSC is more likely to save him. But in scenario 5
(the other "spread" scenario) IRV again prevails over DSC, presumably
because DSC performs badly with truncation (it quickly degrades to FPP,
with lower preferences useless even to those expressing them).
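
A small worked example of IRV eliminating the centrist in a scenario 4
style election (only AB BA BC CB ballots, with the B-first ballots
shrunk). The ballot counts are invented for illustration, and the
functions are a minimal sketch for three candidates, not the code used in
the simulation.

```python
CANDIDATES = "ABC"

def irv_winner(ballots):
    """Three-candidate IRV; truncated ballots exhaust."""
    remaining = set(CANDIDATES)
    while True:
        tally = {c: 0 for c in sorted(remaining)}
        for ranking, n in ballots.items():
            for c in ranking:
                if c in remaining:
                    tally[c] += n
                    break
        active = sum(tally.values())
        leader = max(tally, key=tally.get)
        if 2 * tally[leader] > active or len(remaining) == 1:
            return leader
        remaining.remove(min(tally, key=tally.get))

def prefers(ballots, x, y):
    """Number of ballots ranking x strictly above y (unranked counts as below)."""
    return sum(n for ranking, n in ballots.items()
               if x in ranking
               and (y not in ranking or ranking.index(x) < ranking.index(y)))

def condorcet_winner(ballots):
    for x in CANDIDATES:
        if all(prefers(ballots, x, y) > prefers(ballots, y, x)
               for y in CANDIDATES if y != x):
            return x
    return None

# Invented counts: B is the squeezed centrist.
squeeze = {"AB": 30, "BA": 10, "BC": 11, "CB": 32}
print(irv_winner(squeeze), condorcet_winner(squeeze))
```

B has the fewest first preferences (21 against 30 and 32) and is eliminated
first, after which C beats A 43 to 40, even though B beats both A and C
pairwise.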

Now here are the numbers for the positional analysis:
(Again the order is:
fpp appr irv mmpo schwv schm ca buck pos dsc dac vfa map spst cdla)

Scen 1:
pos 63.2 64.2 72 71.3 73.6 76.6 76.5 66.3 0 64.2 75.7 63.2 37.8 64.4 73.2
70s: irv mmpo schwv schm ca dac cdla
60s: fpp appr buck dsc vfa spst
30s: map

Scen 2:
pos 41.9 67.8 71.7 76.6 79.3 79.8 79.5 73.8 0 45.5 75.1 42.2 62.9 46.8 78.2
70s: irv mmpo schwv schm ca buck dac cdla
60s: appr map
40s: fpp dsc vfa spst

Scen 3:
pos 16.7 83.2 58.3 83.4 83.4 83.4 83.4 83.4 0 50.2 50.2 25.1 83.4 50.2 83.4
80s: appr mmpo schwv schm ca buck map cdla
50s: irv dsc dac spst
20s: vfa
10s: fpp

Scen 4:
pos 34.2 65.7 48.9 81.1 81.1 81.1 81.1 81.1 0 57.6 57.6 35.3 81.1 57.6 81.1
80s: mmpo schwv schm ca buck map cdla
60s: appr
50s: dsc dac spst
40s: irv
30s: fpp vfa

Scen 5:
pos 79 37.5 81.5 71.6 82.2 84.2 82.7 58.4 0 79.6 68.2 79 38.6 80.6 76.1
80s: irv schwv schm ca spst
70s: fpp mmpo dsc vfa cdla
60s: dac
50s: buck
30s: appr map

Scen 6:
pos 51.2 51.6 65.4 75.5 75.5 75.5 74.1 63.4 0 69.5 69.5 53.4 63.4 69 68
70s: mmpo schwv schm ca
60s: irv buck dsc dac map spst cdla
50s: fpp appr vfa

I'll also rank (for now) methods in terms of how often they elect a
candidate with a majority against him, in each scenario. Better methods
(by this measure) first.

Scen 1:
schwv=mmpo schm irv ca cdla pos dac buck spst dsc appr vfa fpp map
Scen 2:
schwv=mmpo=map cdla=ca schm buck pos dac irv appr spst dsc vfa fpp
(first three were perfect in this scenario)
Scen 3:
buck=ca=schm=schwv=mmpo=cdla=map pos irv spst=dac=dsc=appr vfa fpp
(first bunch are perfect)
Scen 4:
buck=ca=schm=schwv=mmpo=cdla=map pos spst=dac=dsc irv vfa fpp appr
(first bunch are perfect)
Scen 5:
schwv=mmpo=map cdla ca buck schm dac pos irv spst dsc vfa fpp appr
(first bunch are perfect)
Scen 6:
ca=schm=schwv=mmpo irv cdla pos spst dac=dsc map=buck vfa appr fpp
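
For reference, one way to test whether a winner has a voted majority
against him is sketched below. The definition used here is an assumption:
some other candidate is ranked strictly above the winner on more than half
of all ballots cast, with unranked candidates counted as below ranked
ones. The function name and the ballots are invented for the example.

```python
CANDIDATES = "ABC"

def has_majority_against(winner, ballots):
    """True if some rival is ranked strictly above `winner` on more than
    half of all ballots (assumed definition)."""
    total = sum(ballots.values())
    for rival in CANDIDATES:
        if rival == winner:
            continue
        above = sum(
            n for ranking, n in ballots.items()
            if rival in ranking
            and (winner not in ranking
                 or ranking.index(rival) < ranking.index(winner))
        )
        if 2 * above > total:
            return True
    return False

# Invented ballots: A leads on first preferences, but 5 of the 9 voters
# rank B above A (and likewise C above A).
example = {"A": 4, "BC": 3, "CB": 2}
print(has_majority_against("A", example))
print(has_majority_against("B", example))
```

With these ballots an FPP win for A would count against FPP by this
measure, while electing B would not.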

The Condorcet compliance rankings are actually pretty similar to these.
The main exception that stands out to me is that in a couple of
scenarios, Bucklin and MAP are very good at not electing candidates 
with a majority against them, but are relatively frequent Condorcet
violators.

That's all for now...

Kevin Venzke