[EM] Some new utility simulations

Kevin Venzke stepjak at yahoo.fr
Sun Jun 6 21:08:59 PDT 2010


Hello,

I've been working on a 3-candidate simulation on 1D or 2D issue space
that uses pre-election approval polls to inform truncation strategy in
rank methods.

The basic setup is that the three candidates are bolted in place (i.e.
only one scenario is examined), and then there are 10k approval polls
followed by 10k elections. In all of these, groups of voters are
distributed randomly in a circle around the origin (in the 2D case)
for each trial.
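
In code terms, the placement amounts to something like this (a minimal
sketch in Python, assuming a rejection-sampling reading of the ±100 /
radius-100 rule spelled out with the first scenario below):

import random

def random_position(dims, limit=100.0):
    """Draw a point uniformly within [-limit, limit] on each available
    axis, rejecting anything beyond Euclidean distance `limit` from
    the origin (a disc in 2D; in 1D nothing is ever rejected)."""
    while True:
        p = [random.uniform(-limit, limit) for _ in range(dims)]
        if sum(x * x for x in p) <= limit * limit:
            return p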

The approval polls are a tricky thing. The precision of the polls probably
has a significant effect on the eventual results from the elections.
Precision is affected by the number of voting blocs, the number of trials,
and the frequency of "reboots" during the trials. Reboots proved necessary
because the polling can fall into a rut that was not inevitable, and I
wanted the same setup to produce the same outcome every time.
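
Roughly, the polling phase might look like this (a simplified sketch,
not my actual code; make_blocs and bloc.approvals are hypothetical
stand-ins for the bloc generator and each bloc's approval strategy, and
seeding the RNG is what would make the same setup repeatable):

def run_polls(make_blocs, n_cands, n_trials=10000, reboot_every=500):
    """Iterated approval polls. Fresh voter blocs are drawn each trial
    and cast approval ballots against the standings from the previous
    trial; approval shares are then averaged over all trials. Periodic
    'reboots' reset the standings so the process can't settle into a
    self-reinforcing rut."""
    uninformed = [1.0 / n_cands] * n_cands
    perceived = list(uninformed)
    totals = [0.0] * n_cands
    for t in range(n_trials):
        if t % reboot_every == 0:
            perceived = list(uninformed)         # reboot
        counts = [0.0] * n_cands
        weight = 0.0
        for bloc in make_blocs():                # hypothetical generator
            weight += bloc.weight
            for c in bloc.approvals(perceived):  # hypothetical strategy
                counts[c] += bloc.weight
        perceived = [x / weight for x in counts]
        totals = [a + b for a, b in zip(totals, perceived)]
    return [x / n_trials for x in totals]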

When the polls are done, the simulation moves on to the elections.

Here are the methods (roughly grouped) that are compared at the moment:
A. Approval: using the zero-info above-mean strategy (ApprZIS) and using
the results from the polling (ApprPoll); these ballot rules are sketched
in code after this list.
B. Range: Normalized sincere (RangeNS).
C. Sincere, strictly ranked methods using no strategy:
FPP, MinMax, IRV, DSC, QR, VFA, SPST.
D. Methods using truncation strategy based on the approval polls:
MinMax(wv), MinMax(margins), MMPO, C//A, Conditional Approval (CdlA),
DAC, Bucklin.
E. LNHarm methods also using truncation:
IRV-tr, DSC-tr, QR-tr.
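
For concreteness, here's roughly what the ballot rules from groups A, D,
and E might look like (a simplified sketch, not my exact rules; `viable`,
the set of frontrunners taken from the polls, is an assumption of the
sketch):

def appr_zis(utils):
    """ApprZIS: approve every candidate whose utility is above the
    voter's mean utility over all candidates (zero-info strategy)."""
    mean = sum(utils) / len(utils)
    return [c for c, u in enumerate(utils) if u > mean]

def appr_poll(utils, viable):
    """Poll-informed approval (an assumed rule for this sketch):
    approve everyone better than the expected winner, taken here as
    the mean utility over the poll frontrunners."""
    expect = sum(utils[c] for c in viable) / len(viable)
    return [c for c, u in enumerate(utils) if u > expect]

def truncated_ranking(utils, viable):
    """Sincere ranking truncated at the approval cutoff: rank the
    approved candidates by utility and leave the rest unranked."""
    approved = set(appr_poll(utils, viable))
    order = sorted(range(len(utils)), key=lambda c: -utils[c])
    return [c for c in order if c in approved]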

I have implemented a couple of other methods but omitted them, either
because they performed worse than expected (Raynaud, 2-slot MMPO) or
because it was too unclear that they would be voted sincerely
(Maj//Antiplurality).

I'm not exactly sure what results to gather from this sim. It's clear
that the choice of scenario has a great effect on the quality of the
various methods, but I can't think of a comprehensive way of evaluating
the situation in general.

Here's a basic 1D scenario with candidates at -50, 0, and 50. (The voters
are placed within ±100 on each available axis, not exceeding a Euclidean
distance of 100 from the origin.)

Method P(best) P(worst)      Avg dist  Norm
BEST 1  0                    48.16706  0
Bucklin .8856  .0007         48.45712  1.2
MMstrict .8852  .0007        48.45847  1.2
CdlA .8852  .0007            48.45847  1.2
DAC .8851  .0007             48.45937  1.2
RangeNS .8285  0             48.73432  2.4
ApprZIS .7666  0             49.30137  4.9
ApprPoll .7787  0            49.31736  5
DSC .7036  .0034             49.48676  5.7
C//A .71  .0209              49.73658  6.8
MMWV .7097  .0209            49.7375  6.8
QR .6617  .0145              50.05157  8.2
MMmarg .6546  .0539          50.26441  9.1
SPST .5443  .0154            50.91098  11.9
IRV .5622  .0769             51.27729  13.5
IRV-tr .5231  .0515          51.33794  13.8
QR-tr .4975  .0547           51.59578  14.9
MMPO .5885  .1356            51.64151  15.1
VFA .4448  .0778             52.13669  17.3
DSC-tr .4198  .0571          52.22204  17.7
FPP .4192  .0579             52.23298  17.7
WORST 0  1                   71.07445  100

"BEST" and "WORST" are what you get if you consistently elect the best
and worst candidate in terms of utility (average distance from voters,
lower being better).

So each method has four values after it: how often the method elected
the best candidate, how often it elected the worst, the average distance
of the winner from the voters (the utility), and a normalization of this
so that BEST scores 0 and WORST scores 100.
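
To make the last two columns concrete (a trivial sketch; "utility" here
is just average Euclidean distance, lower being better):

import math

def avg_distance(voters, cand):
    """Mean Euclidean distance from the voter positions to a candidate."""
    return sum(math.dist(v, cand) for v in voters) / len(voters)

def normalized(avg, best_avg, worst_avg):
    """Rescale so that always electing the utility-best candidate
    scores 0 and always electing the worst scores 100."""
    return 100.0 * (avg - best_avg) / (worst_avg - best_avg)

# e.g. Bucklin in the table above:
# normalized(48.45712, 48.16706, 71.07445) = 1.266..., shown as 1.2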

The candidates were fairly evenly distributed. If we want some center
squeeze, we could try -10, 0, and 10:

Method P(best) P(worst)      Avg dist  Norm
BEST 1  0                    48.37085  0
DAC .9499  .0072             48.38398  .3
Bucklin .9444  .007          48.3858  .3
MMstrict .9406  .0052        48.38713  .3
CdlA .9406  .0052            48.38713  .3
DSC .9292  .0068             48.39459  .5
QR .9079  .0086              48.41177  .9
SPST .9075  .0086            48.41229  .9
RangeNS .8834  0             48.41618  1
C//A .9021  .0178            48.41858  1.1
MMWV .8963  .0178            48.42112  1.2
QR-tr .8592  .0222           48.44625  1.8
FPP .8505  .0243             48.4507  1.9
IRV-tr .8507  .0308          48.45502  2
DSC-tr .8505  .0291          48.45574  2
MMmarg .8516  .0445          48.46732  2.3
MMPO .8672  .0493            48.47064  2.4
IRV .8512  .0496             48.47273  2.4
VFA .8508  .0496             48.47325  2.4
ApprZIS .7799  .0023         48.51882  3.5
ApprPoll .6096  .0025        49.0423  16.1
WORST 0  1                   52.52236  100

One thing I noticed is that Approval and Range tended to be good at not
electing the worst candidate. But they weren't always very good at 
electing the best candidate.

Also, there may be something to Abd's thought that first-preferences
are a helpful indicator in finding the best winner. Notice that not
only did FPP beat IRV, but IRV with truncation beat IRV, which runs a
bit contrary to the typical thought that more information is better.
You can see that of these three FPP was least likely to pick the worst
candidate, and fully-ranked IRV was the most likely. (They were about
the same at picking the best candidate.)

Something to note: while you might think FPP must be an increasingly
bad method as the center squeeze situation gets worse, it's also true
that the more severe the center squeeze, the more uncertain it is that
the center candidate actually has the median voter's support.

Let's move the center candidate to 8, making it a near-clone of the
candidate at 10:

Method P(best) P(worst)      Avg dist  Norm
BEST 1  0                    48.35805  0
MMstrict .9452  .0196        48.38671  .6
CdlA .9452  .0196            48.38671  .6
DSC .8889  .0204             48.3902  .7
QR .8842  .0224              48.39247  .8
SPST .8836  .0224            48.39254  .8
Bucklin .9041  .0308         48.3965  .9
DAC .872  .0308              48.40057  1
RangeNS .8365  .01           48.40567  1.1
DSC-tr .8322  .0448          48.41082  1.2
QR-tr .8348  .0455           48.41165  1.2
IRV-tr .8323  .046           48.41252  1.3
MMmarg .8298  .0463          48.4129  1.3
C//A .8304  .0423            48.41427  1.3
MMWV .8288  .0421            48.41595  1.4
IRV .8307  .0512             48.418  1.4
VFA .8301  .0512             48.41806  1.4
MMPO .8292  .0482            48.41813  1.4
FPP .8471  .0648             48.43528  1.8
ApprZIS .6941  .0242         48.46894  2.6
ApprPoll .6971  .0283        48.46909  2.6
WORST 0  1                   52.4845  100

Happy to see QR doing pretty well here. Bucklin and DAC have been somewhat
dethroned. FPP is back to being bad (which makes sense due to the vote-
splitting) and Approval is still having problems electing the best 
candidate.

If we try to confuse the methods by positioning the candidates oddly
off-center, we can get an interesting result. Place the candidates at
-50, -30, and -20:

Method P(best) P(worst)      Avg dist  Norm
BEST 1  0                    51.8419  0
DAC .9679  .0041             51.85693  .1
Bucklin .9674  .004          51.8572  .1
MMstrict .9647  .0036        51.8577  .1
CdlA .9647  .0036            51.8577  .1
DSC .9344  .005              51.88032  .3
QR .9193  .007               51.89808  .5
SPST .9175  .007             51.90079  .5
C//A .9357  .023             51.9166  .6
MMWV .933  .023              51.91767  .6
ApprPoll .9407  .0016        51.93407  .8
QR-tr .8892  .0216           51.94456  .9
FPP .8864  .021              51.94644  .9
IRV-tr .8884  .0233          51.94926  .9
IRV .8884  .0243             51.95284  .9
DSC-tr .8864  .0245          51.95435  1
VFA .8866  .0243             51.95555  1
RangeNS .8401  .0002         52.00223  1.4
MMmarg .8921  .0486          52.00734  1.4
MMPO .8819  .0815            52.15816  2.8
ApprZIS .2495  .0007         54.21854  21.4
WORST 0  1                   62.9444  100

ApprZIS is predictably much worse than ApprPoll here. The far-left
candidate is not very viable, but under zero-info strategy there is no
way to know this. RangeNS also suffers for the same reason. The other
methods cope about as well as before, since they are informed by the
polls or are not approval-based.

There are outcomes very different from these, especially if we move to
2D. Let's do that, and try placing the candidates at (-10,0), (10,5),
and (10,-5), so that we have two near-clones on the right who differ
slightly on one axis.

Method P(best) P(worst)      Avg dist  Norm
BEST 1  0                    65.42126  0
RangeNS .8353  .0276         65.60041  2.7
DSC .8014  .0512             65.70736  4.3
MMstrict .8014  .0441        65.72435  4.6
MMmarg .7879  .045           65.72974  4.7
C//A .7815  .0478            65.74102  4.9
ApprZIS .7527  .0423         65.75481  5.1
DSC-tr .7849  .0749          65.76045  5.2
QR .7912  .0556              65.76487  5.2
IRV-tr .7755  .0562          65.7662  5.2
MMWV .7739  .053             65.76801  5.3
DAC .758  .0597              65.77155  5.3
IRV .7733  .0604             65.79535  5.7
Bucklin .7381  .057          65.80159  5.8
QR-tr .7711  .0726           65.80724  5.9
SPST .782  .0875             65.81032  5.9
MMPO .7542  .058             65.81198  5.9
ApprPoll .7284  .0521        65.81401  6
VFA .7689  .0882             65.82866  6.2
CdlA .7612  .0649            65.83314  6.3
FPP .705  .188               66.30692  13.5
WORST 0  1                   71.94286  100

RangeNS is now the best method and DSC is second! DAC, Bucklin, and
especially CdlA have fallen quite a long way in relative terms.

Here's another odd one: put the candidates in a large triangle (as though
they're trying to avoid any voter's actual position), at (-100,-100),
(100,-100), and (0,100):

BEST 1  0                    112.2975  0
RangeNS .8921  .0084         113.244  2
ApprZIS .8772  .0135         113.5693  2.7
ApprPoll .8801  .0279        113.6294  2.9
Bucklin .8762  .0272         113.6771  3
DAC .8719  .0271             113.7457  3.1
DSC .8609  .0166             113.9418  3.6
SPST .8537  .0213            114.147  4
VFA .8538  .0213             114.1503  4
FPP .8537  .0216             114.1822  4.1
MMstrict .8499  .0201        114.1887  4.1
DSC-tr .8489  .0236          114.2393  4.2
IRV .8478  .0237             114.2898  4.3
QR .8459  .028               114.3784  4.5
QR-tr .831  .0264            114.6033  5
CdlA .8276  .036             114.7558  5.3
IRV-tr .8113  .031           114.9541  5.8
C//A .807  .0329             115.0151  5.9
MMmarg .8008  .0344          115.1501  6.2
MMWV .7893  .0439            115.4359  6.8
MMPO .778  .0644             115.921  7.9
WORST 0  1                   157.8483  100

The Condorcet methods with truncation are the worst, and strictly-ranked
MinMax is actually beaten by FPP. The situation is odd enough that I
may look into it further.

One thing that strikes me so far is that none of the methods really seem
to be that bad! Of course that's a subjective judgment, and depends a
bit on what scenarios you're interested in. But if one method has a
normalized score (the far-right value) of 4% and another has 2%, it's
debatable whether you can call the former "twice as bad." The absolute
difference is probably very small if you look at the third column
instead. For the most part the methods seemed fairly sensible.

Kevin Venzke