<div dir="ltr">As many of you know, I've been running an online voting experiment using human subjects from Amazon Mechanical Turk. I'm using a 3-candidate, 9-voter "chicken dilemma" scenario, with factions of 4, 2, and 3 voters:<div>


<font face="courier new, monospace">           </font></div><div><font face="courier new, monospace">Cand:        X  Y  Z</font></div><div><font face="courier new, monospace">Faction:   </font></div><div><font face="courier new, monospace">Red   4      3  1  0</font></div>


<div><font face="courier new, monospace">Green 2      0  3  2</font></div><div><font face="courier new, monospace">Blue  3      0  2  3</font></div><div><font face="courier new, monospace">    Size     payoffs</font></div>


<div><font face="courier new, monospace"><br></font></div><div>Each group of 9 is assigned factions and a voting method, and runs the election 3 times, with monetary payoffs proportional to the numbers above in the last two rounds. The participants then answer a survey about how fair, how easy to vote in, and how easy to understand they found the method, plus some demographic questions.</div>
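<div><br></div><div>For anyone who wants to poke at the scenario, here is a minimal Python sketch of the payoff table above. The function names are my own, and "sincere" here is an assumption: it means each voter simply ranks candidates by their own payoff, which is not necessarily how the human subjects behaved.</div>

```python
# Faction sizes and per-candidate payoffs, copied from the table above.
FACTIONS = {
    "Red":   {"size": 4, "payoffs": {"X": 3, "Y": 1, "Z": 0}},
    "Green": {"size": 2, "payoffs": {"X": 0, "Y": 3, "Z": 2}},
    "Blue":  {"size": 3, "payoffs": {"X": 0, "Y": 2, "Z": 3}},
}
CANDIDATES = ["X", "Y", "Z"]


def sincere_first_choices(factions):
    """Count voters per candidate, assuming each votes for their top payoff."""
    tally = {c: 0 for c in CANDIDATES}
    for faction in factions.values():
        favorite = max(CANDIDATES, key=lambda c: faction["payoffs"][c])
        tally[favorite] += faction["size"]
    return tally


def pairwise_margin(factions, a, b):
    """Voters who get more from a than from b, minus those who get more from b."""
    margin = 0
    for faction in factions.values():
        pay = faction["payoffs"]
        if pay[a] > pay[b]:
            margin += faction["size"]
        elif pay[b] > pay[a]:
            margin -= faction["size"]
    return margin


def total_payoff(factions, winner):
    """Sum of payoff units over all 9 voters if `winner` is elected."""
    return sum(f["size"] * f["payoffs"][winner] for f in factions.values())


if __name__ == "__main__":
    print(sincere_first_choices(FACTIONS))
    print({c: total_payoff(FACTIONS, c) for c in CANDIDATES})
    print(pairwise_margin(FACTIONS, "Y", "X"), pairwise_margin(FACTIONS, "Y", "Z"))
```

<div>With the table's numbers, X has the most sincere first choices (4 of 9), while Y wins both of its pairwise comparisons and has the highest total payoff (16, versus 12 for X and 13 for Z). Green and Blue can only beat X by cooperating, which is what makes this a chicken dilemma.</div>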


<div><br></div><div>The voting systems I have tested so far include approval, Borda, Condorcet (minimax), IRV, MAV (medians), plurality, and score. I plan to also test SODA very soon.</div><div><br></div><div>My analysis of the outcomes and strategies is not yet ready to share here. However, I have some numbers on the survey results. I used a Kruskal-Wallis comparison test, which is appropriate for ordinal Likert-scale responses like these. Here are the results for the question "<span class="">How easy was it to </span><span class=""><b>understand</b></span><span class=""> {{methName}} (the voting system you used)?":</span></div>


<div><span class=""><br></span></div><div><span class=""><div><font face="courier new, monospace"><br></font></div><div><font face="courier new, monospace">        trt    means   M</font></div><div><font face="courier new, monospace">1 approval  231.2388   a</font></div>


<div><font face="courier new, monospace">2 borda     229.7286   a</font></div><div><font face="courier new, monospace">3 score     205.4898  ab</font></div><div><font face="courier new, monospace">4 MAV       195.5761 abc</font></div>


<div><font face="courier new, monospace">5 plurality 172.9545  bc</font></div><div><font face="courier new, monospace">6 condorcet 165.3897   c</font></div><div><font face="courier new, monospace">7 IRV       118.8281   d</font></div>
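<div><br></div><div>A note on the "means" column: the values are far larger than any Likert response, so I am assuming here that they are mean ranks, which is what a Kruskal-Wallis procedure works with. The sketch below is a plain-Python illustration of that idea, not the code that produced the table; the function names and the toy responses are invented for the example.</div>

```python
from collections import Counter


def average_ranks(values):
    """Map each distinct value to its rank, averaging ranks across ties."""
    ordered = sorted(values)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    counts = Counter(ordered)
    # A run of t tied values starting at position p has mean rank p + (t - 1) / 2.
    ranks = {v: first_position[v] + (counts[v] - 1) / 2 for v in counts}
    return ranks, counts


def kruskal_wallis(groups):
    """Return (mean rank per group, tie-corrected H statistic).

    Assumes at least two distinct response values overall; otherwise the
    tie correction is zero and H is undefined.
    """
    pooled = [x for obs in groups.values() for x in obs]
    n_total = len(pooled)
    rank_of, counts = average_ranks(pooled)
    mean_ranks = {}
    weighted = 0.0
    for name, obs in groups.items():
        rank_sum = sum(rank_of[x] for x in obs)
        mean_ranks[name] = rank_sum / len(obs)
        weighted += rank_sum ** 2 / len(obs)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    tie_sum = sum(t ** 3 - t for t in counts.values())
    correction = 1 - tie_sum / (n_total ** 3 - n_total)
    return mean_ranks, h / correction


if __name__ == "__main__":
    # Hypothetical ordinal responses, invented purely for illustration.
    toy = {"method_a": [4, 5, 5], "method_b": [2, 3, 4]}
    print(kruskal_wallis(toy))
```

<div>Because all responses are pooled and ranked together, a method's mean rank depends on the total number of respondents, which would explain values in the hundreds.</div>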


<div><br></div><div>The important part of the table above is the letters in the last column. If two systems share at least one letter, the difference between them is not statistically significant. So we can safely say, for instance, that Approval and Borda are easier to understand than Condorcet, but we can't tell whether MAV is as understandable as the former or as confusing as the latter.</div>
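<div><br></div><div>The shared-letter rule is mechanical enough to write down. Here is a small Python sketch using the letters from the "understand" table; the helper names are mine, and the only input is the letter column itself.</div>

```python
# Grouping letters copied from the "understand" table above.
UNDERSTAND_LETTERS = {
    "approval": "a",
    "borda": "a",
    "score": "ab",
    "MAV": "abc",
    "plurality": "bc",
    "condorcet": "c",
    "IRV": "d",
}


def significantly_different(letters, first, second):
    """True when the two methods share no grouping letter."""
    return set(letters[first]).isdisjoint(letters[second])


def distinguishable_pairs(letters):
    """List every pair of methods whose difference is significant."""
    names = list(letters)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if significantly_different(letters, a, b)
    ]


if __name__ == "__main__":
    for pair in distinguishable_pairs(UNDERSTAND_LETTERS):
        print(pair)
```

<div>For this table, MAV (abc) can only be distinguished from IRV, while IRV (d) is distinguishable from every other method.</div>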


<div><br></div><div>Now, one thing in this table gives me pause: the result for plurality. Sure, approval and Borda are simple and intuitive for most people; but are they really more so than plurality? I suspect that this may reflect a flaw in my experiment. People assigned to plurality may, as they take the survey, still be very hazy on what "voting method" means. If all they've ever seen is plurality, it's hard for them to imagine something different. So they may effectively be answering a different question... something like, "How easy was it to understand this experiment as a whole?"</div>


<div><br></div><div>However, I think that the rest of the numbers here are reliable. So clearly, IRV is the hardest of these methods to understand (it shares no letter with any other), and Approval and Borda are among the easiest.</div><div><br></div><div>Now, for the question "<span class="">How easy was it to figure out </span><span class=""><b>how to vote</b></span><span class=""> in {{methName}}?":</span></div>


<div><br></div><div><span class=""><div><font face="courier new, monospace">        trt    means  M</font></div><div><font face="courier new, monospace">1 approval  214.8881  a</font></div><div><font face="courier new, monospace">2 score     211.1531  a</font></div>


<div><font face="courier new, monospace">3 borda     206.8429 ab</font></div><div><font face="courier new, monospace">4 MAV       195.7717 ab</font></div><div><font face="courier new, monospace">5 plurality 190.6970 ab</font></div>


<div><font face="courier new, monospace">6 condorcet 167.6250  b</font></div><div><font face="courier new, monospace">7 IRV       163.4531  b</font></div><div><br></div><div>Generally, rated methods are at the top and ranked ones are at the bottom, though Borda may be (perceived to be) an exception. Again, we can't entirely rely on the number for plurality.</div>


<div><br></div><div>Finally, the question "<span class="">How </span><span class=""><b>fair</b></span><span class=""> did {{methName}} seem to you?":</span></div><div><span class=""><font face="courier new, monospace"><div>


<br></div><div>        trt    means  M</div><div>1 borda     209.0000  a</div><div>2 MAV       206.8587  a</div><div>3 approval  206.0571  a</div><div>4 condorcet 200.2721  a</div><div>5 score     189.3776 ab</div><div>6 plurality 158.3939  b</div>


<div>7 IRV       157.6094  b</div></font></span></div><div><span class=""><br></span></div><div>Again, Approval comes in among the best, and IRV among the worst. Surprisingly, score is not significantly better than plurality/IRV (though it also isn't significantly worse than the best). In this case, though we still have to take the plurality numbers with a grain of salt, I think it's fair to give them some credence. Even if people were answering the question "How fair did the results of this experiment seem to you?", it's not unreasonable to lay whatever unfairness they saw at the feet of plurality.</div>


<div><br></div><div>I think these numbers are certainly interesting. To me, they clearly bolster the case for joining forces behind approval activism, and for eschewing IRV as an activist strategy; even for the majority of us who see some other system as ultimately better than approval.</div>


<div><br></div><div>I'll be sharing more numbers from this experiment as I have them ready. Also, if anybody here wants access to my raw data, I'd be happy to share; though of course, I'd want you to duly cite me if you use them for anything.</div>


<div><br></div><div>Cheers,</div><div>Jameson</div></span>


</div></span>


</div>


</div>