[EM] Empirical voting experiment: first numbers

Kristofer Munsterhjelm km_elmet at t-online.de
Mon Aug 25 13:44:31 PDT 2014


On 08/22/2014 01:05 AM, Jameson Quinn wrote:
> As many of you know, I've been running an online voting experiment using
> human subjects from Amazon Mechanical Turk. I'm using a 3-candidate,
> 9-voter "chicken dilemma" scenario, with factions of 4, 2, and 3 voters:
> Faction  Size   Payoff if the winner is
>                   X  Y  Z
> Red       4       3  1  0
> Green     2       0  3  2
> Blue      3       0  2  3
>
> Each group of 9 gets assigned factions and a voting method, and runs the
> election 3 times, with monetary payoffs proportional to the numbers
> above in the last two rounds. Then they answer a survey about how fair,
> easy to vote, and easy to understand they found the method, plus some
> demographic questions.
>
> The voting systems I have tested so far include approval, Borda,
> Condorcet (minimax), IRV, MAV (medians), plurality, and score. I plan to
> also test SODA very soon.
>
> My analysis of the outcomes and strategies is not yet ready to share
> here. However, I have some numbers on the survey results. I used a
> Kruskal-Wallis comparison test, appropriate for Likert-scale results
> like these. Here are the results for the question "How easy was it to
> *understand* {{methName}} (the voting system you used)?":
>
>
>          trt    means   M
> 1 approval  231.2388   a
> 2 borda     229.7286   a
> 3 score     205.4898  ab
> 4 MAV       195.5761 abc
> 5 plurality 172.9545  bc
> 6 condorcet 165.3897   c
> 7 IRV       118.8281   d
>
> The important thing about the above table is the letters at the end. If
> two systems share at least one letter in common, the differences between
> those systems are not statistically significant. So we can safely say,
> for instance, that Approval and Borda are easier to understand than
> Condorcet, but we can't tell whether MAV is as understandable as the
> former or as confusing as the latter.
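
The letter rule described above can be sketched in a few lines of Python. This is only an illustration: the letters are copied from the "understand" table, and the helper name significantly_different is hypothetical, not part of the original analysis.

```python
# Grouping letters copied from the "understand" table quoted above.
groups = {
    "approval": "a",
    "borda": "a",
    "score": "ab",
    "MAV": "abc",
    "plurality": "bc",
    "condorcet": "c",
    "IRV": "d",
}

def significantly_different(m1, m2, letters=groups):
    """Return True when the two methods share no grouping letter."""
    return not (set(letters[m1]) & set(letters[m2]))

print(significantly_different("approval", "condorcet"))  # True
print(significantly_different("MAV", "approval"))        # False
print(significantly_different("MAV", "condorcet"))       # False
```

MAV shares "a" with Approval and "c" with Condorcet, which is why it cannot be separated from either.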

Here I am surprised that Condorcet was considered easier to
understand than IRV. IRV advocates often say that "remove the loser
from the ballots and run again until someone gets a majority" is a very
simple phrasing, and it certainly seems simpler than explaining Minmax.
Did you explain the actual Minmax method or just the Condorcet criterion
(the candidate who would beat every other candidate one-on-one wins)?

If you did explain Minmax itself, I am indeed surprised. I'm not going
to complain, though! If the results are representative, that would be a
serious counter to the "IRV is so easy" argument. The method itself is
harder to understand according to your numbers, and if the advocates try 
to shift the goal to "as easy as 1-2-3", well, then Condorcet is just as 
easy because the front-end is the same.
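
To make the two phrasings concrete, here is a minimal Python sketch that applies the IRV rule and the one-on-one Condorcet rule to the 4/2/3 scenario. It assumes every voter ranks sincerely in order of payoff, which the experiment does not guarantee, and it ignores ties. The function names are illustrative only.

```python
# (faction size, sincere ranking derived from the payoff table).
# Sincere voting is an assumption made for this sketch.
factions = [
    (4, ["X", "Y", "Z"]),  # Red
    (2, ["Y", "Z", "X"]),  # Green
    (3, ["Z", "Y", "X"]),  # Blue
]
candidates = ["X", "Y", "Z"]

def first_choice_tally(factions, remaining):
    """Count each ballot for its highest-ranked remaining candidate."""
    tally = {c: 0 for c in remaining}
    for size, ranking in factions:
        top = next(c for c in ranking if c in remaining)
        tally[top] += size
    return tally

def plurality_winner(factions, candidates):
    tally = first_choice_tally(factions, candidates)
    return max(tally, key=tally.get)

def irv_winner(factions, candidates):
    """Remove the loser and recount until someone has a majority."""
    remaining = list(candidates)
    while True:
        tally = first_choice_tally(factions, remaining)
        leader = max(tally, key=tally.get)
        if 2 * tally[leader] > sum(tally.values()):
            return leader
        remaining.remove(min(tally, key=tally.get))

def pairwise(a, b, factions):
    """Number of voters ranking a above b."""
    return sum(size for size, ranking in factions
               if ranking.index(a) < ranking.index(b))

def condorcet_winner(factions, candidates):
    """Candidate who beats every other one-on-one, or None."""
    for c in candidates:
        if all(pairwise(c, o, factions) > pairwise(o, c, factions)
               for o in candidates if o != c):
            return c
    return None

print(plurality_winner(factions, candidates))  # X
print(irv_winner(factions, candidates))        # Z
print(condorcet_winner(factions, candidates))  # Y
```

Under these assumptions the three rules pick three different winners. Since Minmax elects the Condorcet winner whenever one exists, it would also pick Y here.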

> Now, one thing in this table gives me pause: the result for plurality.
> Sure, approval and Borda are simple and intuitive for most people; but
> are they really more so than plurality? I suspect that this may reflect
> a flaw in my experiment. People assigned to plurality may, as they take
> the survey, still be very hazy on what "voting method" means. If all
> they've ever seen is plurality, it's hard for them to imagine something
> different. So they may effectively be answering a different question...
> something like, "How easy was it to understand this experiment as a whole?"
>
> However, I think that the rest of the numbers here are reliable. So
> clearly, IRV is hard to understand, and Approval and Borda are easy.
>
> Now, for the question "How easy was it to figure out *how to vote* in
> {{methName}}?":
>
>          trt    means  M
> 1 approval  214.8881  a
> 2 score     211.1531  a
> 3 borda     206.8429 ab
> 4 MAV       195.7717 ab
> 5 plurality 190.6970 ab
> 6 condorcet 167.6250  b
> 7 IRV       163.4531  b
>
> Generally, rated methods are at the top, ranked ones are at the bottom;
> though Borda may be (perceived to be) an exception. Again, we can't
> entirely rely on the number for plurality.

Seems reasonable. I find ranking easier than rating (there is less to
worry about, such as whether I got the scale wrong), but I might well be
in the minority.

> Finally, the question "How *fair* did {{methName}} seem to you?":
>
>          trt    means  M
> 1 borda     209.0000  a
> 2 MAV       206.8587  a
> 3 approval  206.0571  a
> 4 condorcet 200.2721  a
> 5 score     189.3776 ab
> 6 plurality 158.3939  b
> 7 IRV       157.6094  b
>
> Again, Approval comes in among the best, and IRV among the worst.
> Surprisingly, score is not significantly better than plurality/IRV
> (though it also isn't significantly worse than the best). In this case,
> though we still have to take the plurality numbers with a grain of salt,
> I think it's fair to give them some credence. Even if people were
> answering the question "How fair did the results of this experiment seem
> to you?", it's not unreasonable to lay whatever unfairness they saw at
> the feet of plurality.

However, that may also show that the Turkers aren't good at evaluating
fairness. They rate Borda among the best, even though we know about its
extreme teaming incentive. OTOH, they also put IRV in the same class as
Plurality. I could understand either judgement, but both at the same
time is quite unexpected.

> I think these numbers are certainly interesting. To me, they clearly
> bolster the case for joining forces behind approval activism, and for
> eschewing IRV as an activist strategy; even for the majority of us who
> see some other system as ultimately better than approval.

Right. Approval is a simple fix to Plurality, gives the best bang for
the buck, and is easily understood. I think the greatest risk to
Approval is a scenario where it is implemented, the chicken dilemma
makes it dangerously unstable, and after it has picked the wrong winner
a few times due to voters mis-anticipating each other, it is repealed
much as Burlington repealed IRV.

Maybe your strategy data will provide information on how realistic that
scenario is.

