[EM] Calculated failure/success rates using randomized ballots and candidates

Mon Jan 4 15:03:58 PST 2021

Please consider merging your code with Warren D. Smith's election simulator
available at https://www.rangevoting.org/IEVS/IEVS.c

On Mon, Jan 4, 2021 at 12:15 AM VoteFair <electionmethods at votefair.org>
wrote:

> I've written a C++ program that generates randomized ballots, feeds
> these ballots to a separate program that calculates a winner according
> to various vote-counting methods, compares the results, and calculates
> failure/success rates.
>
> The program does two kinds of tests:
>
> * IIA: Tests successes/failures according to the Independence of
> Irrelevant Alternatives (IIA) criterion.  Specifically it calculates
> which candidate would win with all the candidates, and then it removes
> each of the non-winning candidates, one by one, to test whether a
> different candidate would win.  If any of the comparisons yield a
> different result -- such as a 6-candidate contest giving a different
> winner compared to the 7-candidate contest that uses the same ballots
> (with one candidate omitted in the 6-candidate contest) -- then that's
> counted as one failure.
>
> * "Agree/Disagree": Tests how often one counting method yields a winner
> who is the same as, or different from, the winner according to another
> vote-counting method.
>
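
The IIA test described above can be sketched roughly as follows. This is an illustration in Python, not the code from the C++ programs described in this message. The names `drop_candidate`, `plurality`, and `iia_outcome` are hypothetical, plurality is only a stand-in counting method, and dropping the whole case when any one of the contests ties is an assumption about how ties are ignored.

```python
from collections import Counter

def drop_candidate(ballots, candidate):
    """Return copies of the ballots with one candidate removed."""
    return [[c for c in ballot if c != candidate] for ballot in ballots]

def plurality(ballots):
    """Stand-in method: most first choices wins; None signals a tie."""
    tally = Counter(ballot[0] for ballot in ballots).most_common()
    if len(tally) > 1 and tally[0][1] == tally[1][1]:
        return None
    return tally[0][0]

def iia_outcome(ballots, method):
    """Classify one test case as 'success', 'failure', or 'tie'."""
    full_winner = method(ballots)
    if full_winner is None:
        return "tie"
    # Remove each non-winning candidate in turn and recount.
    reduced_winners = [
        method(drop_candidate(ballots, loser))
        for loser in ballots[0]
        if loser != full_winner
    ]
    if any(winner is None for winner in reduced_winners):
        return "tie"  # assumption: a tie in any contest drops the whole case
    if any(winner != full_winner for winner in reduced_winners):
        return "failure"  # counted once, however many removals change the winner
    return "success"
```

For example, with three ballots A>B>C, two B>C>A, and two C>B>A, plurality elects A from the full contest, but removing B elects C, so `iia_outcome` classifies the case as one failure.
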
> ASSUMPTIONS/CONDITIONS:
>
> The randomized ballots assume that the voters are expressing their
> sincere preferences, without any tactical voting, and without ranking or
> rating two candidates at the same preference level.
>
> For both tests, when a tie occurs for the winner of either test case,
> the tied case is ignored.  This means ties do not count as a failure or
> a success.
>
> For my purposes I've used 4,000 randomized cases per test, and each test
> uses 17 ballots.  Unless otherwise specified, the test (or full test in
> the case of IIA) uses 7 candidates.
>
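
A minimal sketch of generating one test case under the stated conditions (17 ballots, 7 candidates, strict rankings with no shared preference levels). It assumes each ballot is an independent, uniformly random ordering of the candidates; the actual generator in generate_random_ballots.cpp may work differently, and `random_ballots` and its parameters are hypothetical names.

```python
import random

def random_ballots(num_ballots=17, num_candidates=7, seed=None):
    """Return num_ballots strict rankings, most preferred candidate first."""
    rng = random.Random(seed)  # seeded generator so a test case can be reproduced
    candidates = list(range(1, num_candidates + 1))
    # sample() with k == len(candidates) yields a random permutation,
    # so no two candidates ever share a preference level.
    return [rng.sample(candidates, num_candidates) for _ in range(num_ballots)]
```

Calling `random_ballots(seed=n)` for 4,000 different values of `n` would give one reproducible set of test cases of the size used here.
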
> IIA RESULTS:
>
> The Independence of Irrelevant Alternatives (IIA) success rate of the
> Condorcet-Kemeny method is 79%, which is a failure rate of 21%.
>
> The IIA success rates of the following methods are all about 1% or 2%,
> which is a failure rate of 99% or 98%:
>
> * IRV: Instant Runoff Voting
>
> * IPE: Instant Pairwise Elimination (described on Electowiki)
>
> * IRMPL: Instant Runoff Minus Pairwise Loser, which uses PLE (see below)
> as a safety net under IRV.
>
> * STAR: Score Then Automatic Runoff
>
> Another method, PLE, which is an abbreviation for Pairwise Loser
> Elimination, has a calculated 100% success rate, which is zero failures.
> This method successively eliminates the Condorcet loser, one round at
> a time, and stops with a tie when it encounters a Condorcet
> (rock-paper-scissors-like) cycle.  This perfect success rate occurs
> because tied cases are ignored, which leaves only cases that have no
> cycles at any level, which means the method finds the Condorcet winner
> in both the 7-candidate case and the 6-candidate case.
>
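
The PLE description above might look like this as a sketch. It assumes every ballot is a strict ranking of all candidates, and the names `beats`, `pairwise_loser`, and `ple_winner` are hypothetical rather than taken from votefair_ranking.cpp. Returning `None` stands for the tie that is reported when no pairwise (Condorcet) loser exists.

```python
def beats(ballots, a, b):
    """True if more ballots rank a above b than rank b above a."""
    a_over_b = sum(1 for ballot in ballots if ballot.index(a) < ballot.index(b))
    return a_over_b > len(ballots) - a_over_b

def pairwise_loser(ballots, remaining):
    """Return the candidate who loses to every other remaining one, else None."""
    for x in remaining:
        if all(beats(ballots, y, x) for y in remaining if y != x):
            return x
    return None

def ple_winner(ballots):
    """Eliminate the pairwise loser round by round; None signals a tie."""
    remaining = list(ballots[0])
    while len(remaining) > 1:
        loser = pairwise_loser(ballots, remaining)
        if loser is None:
            return None  # no pairwise loser (a cycle, or a pairwise tie)
        remaining.remove(loser)
    return remaining[0]
```

With the three ballots D>A>B>C, D>B>C>A, and D>C>A>B, candidate D beats everyone, yet `ple_winner` returns `None` because A, B, and C form a cycle and none of them is a pairwise loser. Cases like this are among the ties that get ignored, which is why only cycle-free cases remain.
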
> Conclusion: Methods that eliminate one candidate at a time frequently
> fail the Independence of Irrelevant Alternatives (IIA) criterion.  In
> contrast, the Condorcet-Kemeny method, which uses all the pairwise
> counts (not just the biggest or smallest pairwise counts or differences
> between pairwise counts), yields a dramatically better IIA success rate.
>
> Important: In real elections the success rates would be higher -- there
> would be fewer failures -- because real elections have meaningful
> differences between candidates.  Remember that this software randomizes
> the ballots without any bias.  This means that almost all the test cases
> are "semi-balanced" or "sitting on the fence" (or maybe "finding the
> highest sand dune rather than finding the highest mountain") kinds of
> cases.
>
> Advocates of STAR voting may claim these numbers are not meaningful for
> STAR voting because STAR voting uses Score ballots, not ranked ballots.
> (On Score ballots the gap between preference levels is significant.)
> Yet fans of STAR voting also claim that its use of a top-two runoff
> discourages tactical voting, particularly the tactic of favoring the use
> of high and low preference levels, and avoiding the use of middle
> preference levels.  Keeping in mind that these tests assume the voters
> are voting sincerely, I believe these two claims are contradictory.
> (Feedback on this or any other part of this message is welcome.)
>
> AGREE/DISAGREE RESULTS:
>
> Below are the results from the "Agree/Disagree" test, which in my tests
> compares the Condorcet-Kemeny winner with the winner from each of the
> indicated vote-counting methods.  Specifically the "agree" percentages
> refer to matches with the Condorcet-Kemeny winner, and the "disagree"
> percentages apply when the method identifies a different winner.  (By
> definition the Condorcet-Kemeny method would yield 100% agreement.)
>
> About the "ties" numbers specified in parentheses:  They are counts out
> of 4,000 cases, not percentages.  These tied cases are not counted in
> either the success or failure percentages.
>
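
As a rough sketch of how the agree, disagree, and tie figures relate, assuming a case counts as tied when either method fails to name a single winner. The function name `agree_disagree` and the use of `None` to mark a tie are hypothetical, not taken from the programs described in this message.

```python
def agree_disagree(winner_pairs):
    """Tally (reference_winner, other_winner) pairs; None marks a tie.

    Returns (agree_percent, disagree_percent, tie_count). The percentages
    are taken over non-tied cases only; the tie count is a plain count.
    """
    agree = disagree = ties = 0
    for reference, other in winner_pairs:
        if reference is None or other is None:
            ties += 1  # ignored: neither a success nor a failure
        elif reference == other:
            agree += 1
        else:
            disagree += 1
    decided = agree + disagree
    if decided == 0:
        return None, None, ties  # every case tied, so no percentage exists
    return 100.0 * agree / decided, 100.0 * disagree / decided, ties
```

Here the reference winner would be the Condorcet-Kemeny winner, and the other winner would come from IPE, IRMPL, STAR, IRV, or PLE.
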
> Note that when there are only two candidates, all the methods always agree.
>
> number of candidates: 2
> IPE agree/disagree: 100.0%  0%  (0 ties)
> IRMPL agree/disagree: 100.0%  0%  (0 ties)
> STAR agree/disagree: 100.0%  0%  (0 ties)
> IRV agree/disagree: 100.0%  0%  (0 ties)
> PLE agree/disagree: 100.0%  0%  (0 ties)
>
> number of candidates: 3
> IPE agree/disagree: 95.1%  4.8%  (0 ties)
> IRMPL agree/disagree: 95.7%  4.2%  (0 ties)
> STAR agree/disagree: 95.2%  4.7%  (296 ties)
> IRV agree/disagree: 93.0%  6.9%  (643 ties)
> PLE agree/disagree: 100.0%  0%  (286 ties)
>
> number of candidates: 4
> IPE agree/disagree: 92.5%  7.4%  (59 ties)
> IRMPL agree/disagree: 91.3%  8.6%  (9 ties)
> STAR agree/disagree: 94.8%  5.1%  (440 ties)
> IRV agree/disagree: 84.0%  15.9%  (1582 ties)
> PLE agree/disagree: 100.0%  0%  (943 ties)
>
> number of candidates: 5
> IPE agree/disagree: 92.3%  7.6%  (103 ties)
> IRMPL agree/disagree: 88.9%  11.0%  (14 ties)
> STAR agree/disagree: 93.6%  6.3%  (435 ties)
> IRV agree/disagree: 77.7%  22.2%  (2485 ties)
> PLE agree/disagree: 100.0%  0%  (1724 ties)
>
> number of candidates: 6
> IPE agree/disagree: 90.6%  9.3%  (172 ties)
> IRMPL agree/disagree: 84.9%  15.0%  (27 ties)
> STAR agree/disagree: 91.4%  8.5%  (420 ties)
> IRV agree/disagree: 69.7%  30.2%  (3203 ties)
> PLE agree/disagree: 100.0%  0%  (2513 ties)
>
> number of candidates: 7
> IPE agree/disagree: 88.7%  11.2%  (219 ties)
> IRMPL agree/disagree: 81.6%  18.3%  (67 ties)
> STAR agree/disagree: 89.4%  10.5%  (441 ties)
> IRV agree/disagree: 59.5%  40.4%  (3517 ties)
> PLE agree/disagree: 100.0%  0%  (3063 ties)
>
> As the number of candidates increases, the methods more often disagree
> with the Condorcet-Kemeny method.  So the bottom numbers, where there
> are 7 candidates, are the most revealing.
>
> The bottom numbers show that IRV -- Instant Runoff Voting -- is the
> worst of these methods.  It agrees in about 60 percent of the non-tie
> cases.  The other three methods -- IPE, IRMPL, and STAR -- have
> agreement rates between about 82 and 89 percent.
>
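
Because ties are excluded, each 7-candidate percentage rests on a different number of cases. A small calculation makes this visible, assuming each tie count is the number of cases dropped from that row out of the 4,000 generated:

```python
TOTAL_CASES = 4000  # randomized cases per test, as stated in the message

# Tie counts copied from the 7-candidate results.
ties_7_candidates = {"IPE": 219, "IRMPL": 67, "STAR": 441, "IRV": 3517, "PLE": 3063}

# Cases that remain once ties are set aside.
non_tied = {method: TOTAL_CASES - ties for method, ties in ties_7_candidates.items()}

for method, count in non_tied.items():
    print(f"{method}: {count} non-tied cases")
# Counts: IPE 3781, IRMPL 3933, STAR 3559, IRV 483, PLE 937
```

Under that assumption, the 59.5% agreement figure for IRV is based on 483 cases, far fewer than the figures for IPE, IRMPL, or STAR.
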
> Of course this result -- that IRV is not a good vote-counting method --
> is not surprising.  Yet it's nice to have numeric confirmation.
>
> LINKS:
>
> Here are links to the two programs used in these tests:
>
> https://github.com/cpsolver/VoteFair-ranking-cpp/blob/master/generate_random_ballots.cpp
>
> https://github.com/cpsolver/VoteFair-ranking-cpp/blob/master/votefair_ranking.cpp
>
> GOAL:
>
> My hope is that this software takes us a step closer to yielding numbers
> to better characterize HOW OFTEN each vote-counting method passes or
> fails each of the "fairness" criteria, the ones that are currently
> flagged as "yes" or "no" in this comparison table:
>
> https://en.wikipedia.org/wiki/Comparison_of_electoral_systems#Compliance_of_selected_single-winner_methods
>
> I realize the numbers calculated by my software are not suitable as
> estimates for real-life elections -- because randomized ballots and
> randomized candidates do not match real-life elections.  Yet these
> calculated numbers provide a peek at ways to compare methods more
> meaningfully than just flagging methods as pass-or-fail.
>
> THANKS:
>
> If you find any software bugs, please tell me, either here or on GitHub.
>
> Feedback is welcome.  That's why I've posted these results here.
>
> Richard Fobes
> ----
> Election-Methods mailing list - see https://electorama.com/em for list
> info
>