[EM] Scatter plot of clone independence versus IIA

Sun Jun 27 10:32:44 PDT 2021

On 22.06.2021 20:36, VoteFair wrote:
> On 6/20/2021 6:53 AM, Kristofer Munsterhjelm wrote:
>> ...
>> I'd say the three clone failure types are:
>>
>> Suppose A is cloned into A1 and A2. Then, for any two other candidates B
>> and C from the original election:
>>
>> 1. If A won before cloning, but B wins after, that's vote-splitting.
>> 2. If B won before cloning, but A1 or A2 wins after, that's teaming.
>> 3. If B won before cloning, but C wins after, that's crowding.
>>
>> Everything else (A->A1, A->A2, B->B, C->C) is a pass.
> 
> 
> I see four categories:
> 
> 1.  A wins without clone, but B wins with clone: clone "hurts" similar
> candidate (yes usually because of vote splitting but I'm not sure that's
> always the case)
> 
> 2.  B wins without clone, but A or A1 or A2 wins with clone: clone
> "helps" similar candidate (A/A1) (yes usually intentionally if teaming
> used, but can be accidental)
> 
> 3.  B wins without clone, but C wins with clone: similar to IIA (I don't
> have a name for this category)
> 
> 4.  A wins without clone, but A2 wins with clone: clone "displaces"
> similar candidate

The third category is usually called "crowding". This terminology seems
to be due to Blake Cretney, see
http://lists.electorama.com/pipermail/election-methods-electorama.com//2002-February/105357.html.
The earliest instance I can find of his jargon page is this:
https://web.archive.org/web/19991001204109/http://www.fortunecity.com/meltingpot/harrow/124/defn.html

As for the fourth, I don't think that's a failure, because for any
election where A is cloned into A and A2, there's an equivalent election
where A is cloned into A2 and A. Since the only property clones have to
obey is that everybody ranks them next to each other, it's not possible
to determine who's the original and who's the clone; to distinguish A
and A2 would be a failure of the neutrality criterion.
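
The "ranked next to each other" property can be checked mechanically.
Here is a minimal Python sketch, assuming complete strict rankings
stored as (count, ranking) pairs; the function name is_clone_set and
the small example election are made up for illustration.

```python
def is_clone_set(ballots, group):
    """Return True if every ballot ranks the members of `group` contiguously.

    Assumes each ballot is a complete strict ranking (no truncation or
    equal-rank), given as a (count, ranking) pair.
    """
    group = set(group)
    for _count, ranking in ballots:
        positions = [i for i, c in enumerate(ranking) if c in group]
        if len(positions) != len(group):
            return False  # a member is missing from this ballot
        # Contiguous means the positions span exactly len(group) slots.
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            return False
    return True


# Hypothetical election, not taken from the thread.
example = [(4, ["A", "A2", "B"]), (3, ["B", "A2", "A"])]
print(is_clone_set(example, {"A", "A2"}))  # True
print(is_clone_set(example, {"A", "B"}))   # False
```

The check treats A and A2 symmetrically: nothing in the ballots marks
one of them as the original.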

Or to put it differently, suppose you have an election like this, call
it election eX:

45: A>B>C
33: B>C>A
30: C>A>B

Without loss of generality, suppose A wins. Now make two clone
elections, eY:

45: A>A2>B>C
33: B>C>A2>A
30: C>A>A2>B

and eZ:

45: A2>A>B>C
33: B>C>A>A2
30: C>A2>A>B

Now in eY, either the winner always comes from the clone set, or
someone outside the clone set wins with positive probability. If it's
the latter, you have a straightforward clone failure, since A won eX
with certainty. So suppose the winner/s in eY is/are part of the clone
set.

If A and A2 tie, then A has gone from winning with certainty to winning
with 50% probability, which is a failure of the fourth criterion. If A2
wins, then it's also a failure of the fourth criterion. This only leaves
the possibility that A wins in eY.

But if A wins in eY, then by neutrality A2 must win in eZ, which would
be a failure of the fourth criterion. So if the fourth category counted
as a failure, no neutral method could be cloneproof: it would either
fail when going from eX to eY, or when going from eX to eZ.
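
The symmetry between eY and eZ can be verified directly. The following
Python sketch builds both clone elections from eX and confirms that
swapping the labels A and A2 turns one into the other. The helper names
clone and relabel are my own, and the per-ballot clone orders are simply
copied from the ballots listed above.

```python
# Election eX from the text, as (count, ranking) pairs.
eX = [(45, ["A", "B", "C"]), (33, ["B", "C", "A"]), (30, ["C", "A", "B"])]


def clone(ballots, target, orders):
    """Replace `target` on each ballot with that ballot's clone order."""
    out = []
    for (count, ranking), order in zip(ballots, orders):
        new_ranking = []
        for c in ranking:
            new_ranking.extend(order if c == target else [c])
        out.append((count, new_ranking))
    return out


def relabel(ballots, mapping):
    """Rename candidates according to `mapping`, leaving others unchanged."""
    return [(n, [mapping.get(c, c) for c in r]) for n, r in ballots]


# Clone orders per ballot, as in eY and eZ above.
eY = clone(eX, "A", [["A", "A2"], ["A2", "A"], ["A", "A2"]])
eZ = clone(eX, "A", [["A2", "A"], ["A", "A2"], ["A2", "A"]])

# Swapping the two labels maps eY exactly onto eZ, so a neutral method
# that elects A in eY has to elect A2 in eZ.
print(relabel(eY, {"A": "A2", "A2": "A"}) == eZ)  # True
```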

>> James Green-Armytage's paper on IRV shows that even though IRV is
>> theoretically cloneproof, it's unusually vulnerable to candidate exit
>> (where similar candidates leave the race to get a candidate elected).
>> Does your plot try to find such "near clone independence" failures?
> 
> No it does not try to isolate this kind of failure.  Instead, the IIA
> (independence of irrelevant alternatives) test does this kind of testing
> where removing any candidate is a failure.  In the IIA tests there are
> no clones, so no my tests don't measure this specific kind of failure.
> Such failures would just contribute toward the IIA failures.
> 
>> If it doesn't, then something's still off: IRV is strictly speaking
>> clone independent, so its clone independence rate should be 100%.
> 
> This bothers me too, yet I've looked for bugs that might affect this and
> I haven't found any -- although of course that doesn't mean they aren't
> there.
> 
> If I were only testing IRV failures without comparing the results to
> other methods using the same ballots, then I could simply ignore cases
> where there is a tie for the fewest transferred votes (during any of the
> intermediate, but not final, elimination rounds).  And then it would
> have 100% clone independence.
> 
> How does the academic paper resolve such ties?  Or does the math assume
> such ties do not occur?

Suppose that you break ties with a tiebreaker that itself is cloneproof.
Then the cloning can't cause someone who would've been eliminated later
to be eliminated earlier (since there's no teaming or crowding in either
the tiebreak or in Plurality). So the elimination order is the same up
until either A or A2 is eliminated and the other one remains. After that
happens, the remaining candidate gets all of the clone set's first
preference votes and the method proceeds as it would without the clone.

Finding an appropriate tiebreaker is the difficult part. I don't know
how JGA did it, but here's a suggestion that should be cloneproof.

Let's say the tiebreaker consists of a ranking, where in a tie between
some candidates, you eliminate the tied candidate that's ranked last on
that ranking. You have one ranking before the cloning and another one after.

Then the proof says that as long as you eliminate candidates one at a
time, and the rankings before and after are the same, except that the
candidate cloned is replaced with the clone set in some order, IRV will
be cloneproof.

The usual way to get such a ranking would be to pick a random ballot and
use the ranking on that ballot (completing it with random voter
hierarchy if there's equal-rank or truncation:
https://electowiki.org/wiki/Maximize_Affirmed_Majorities#Random_Voter_Hierarchy_tiebreak_procedure).

So you could implement this using random ballot to create the ranking
before the cloning, and use the exact same ballot to create the ranking
after the cloning.
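
As a rough illustration, here is a minimal Python sketch of IRV with
one-at-a-time elimination and a ranking-based tiebreak. It is not JGA's
procedure or VoteFair's code. For simplicity the tiebreak ranking is
passed in directly rather than drawn from a random ballot, and the
three-candidate election with a tie for last place is made up for the
example.

```python
def irv(ballots, tiebreak):
    """IRV that eliminates one candidate per round.

    ballots:  list of (count, ranking) pairs.
    tiebreak: list of all candidates; among candidates tied for fewest
              votes, the one ranked last on this list is eliminated.
    """
    remaining = set(tiebreak)
    while len(remaining) > 1:
        tally = {c: 0 for c in remaining}
        for count, ranking in ballots:
            # Each ballot counts for its highest-ranked remaining candidate.
            for c in ranking:
                if c in remaining:
                    tally[c] += count
                    break
        fewest = min(tally.values())
        tied = [c for c in remaining if tally[c] == fewest]
        loser = max(tied, key=tiebreak.index)  # last on the tiebreak ranking
        remaining.remove(loser)
    return next(iter(remaining))


# Hypothetical election where B and C tie for fewest first preferences.
before = [(3, ["A", "B", "C"]), (2, ["B", "C", "A"]), (2, ["C", "B", "A"])]
# Same election with A cloned into A and A2.
after = [(3, ["A", "A2", "B", "C"]),
         (2, ["B", "C", "A2", "A"]),
         (2, ["C", "B", "A2", "A"])]

# The tiebreak ranking after cloning is the one before cloning, with A
# replaced by the clone set in some order.
print(irv(before, ["A", "B", "C"]))        # B
print(irv(after, ["A", "A2", "B", "C"]))   # B
```

In both runs the B/C tie is resolved the same way, so the winner does
not change when A is cloned.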

One could argue that this leaves too much to chance: it could be that
IRV is cloneproof only for some random ballot rankings, but not for
every such ranking. But the proof shows that it's either cloneproof for
none of them or for all of them. So in the particular case of IRV, this
will suffice to show that IRV is cloneproof - the randomness has no
effect as far as clone resistance goes.

> I do ignore cases where IRV yields a tie for winner.  And also I ignore
> cases where the Condorcet-Kemeny method or plurality yield a tie for who
> wins.  That makes sense because those cases are actual ties.

I think you can handle clone failure in the presence of winning ties.

If the winner set doesn't contain A, you clone A, and now it contains A,
then that's teaming.

If the winner set contains A, you clone A, and A is kicked off the set,
that's vote-splitting.

If you clone A and someone who isn't A is either kicked off or added to
the winner set, then that's crowding.

This doesn't properly classify some clone failures, but some of them
defy classification into just one category when you're dealing with
winner ties. For instance, suppose the winner set before cloning is A
and B; and after, it is A, B, and C. This has aspects both of
vote-splitting (because the chance of a tie-breaker choosing A is
reduced) and crowding (because C got admitted).
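
The three winner-set rules above can be written down as a short Python
sketch. The function name classify_sets is my own, and I'm assuming
that "contains A" after cloning means "contains some member of the
clone set".

```python
def classify_sets(before, after, original, clones):
    """Label a clone failure from the winner sets before and after cloning.

    before, after: sets of winners (ties allowed).
    original:      the candidate that was cloned.
    clones:        the clone set after cloning (includes the original).
    Returns a set of labels; an empty set means a pass.
    """
    clones = set(clones)
    labels = set()
    had = original in before
    has = bool(clones & set(after))  # assumption: any clone counts as "A"
    if not had and has:
        labels.add("teaming")
    if had and not has:
        labels.add("vote-splitting")
    # Someone outside the clone set was kicked off or added.
    if set(before) - {original} != set(after) - clones:
        labels.add("crowding")
    return labels


# The example from the text: {A, B} before, {A, B, C} after.
print(classify_sets({"A", "B"}, {"A", "B", "C"}, "A", {"A", "A2"}))
# {'crowding'}
```

As noted, the set-based rules only report crowding here; they don't see
the vote-splitting aspect.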

More generally, suppose you have a cloneproof tiebreaker like Random
Ballot (choose a random ballot, pick the winner who's listed first on
it). Then something like:

- if the probability that the winner comes from the clone set after
cloning is reduced, that's vote-splitting;

- if the probability that the winner comes from the clone set after
cloning is increased, that's teaming;

- and if the conditional probability of electing some candidate X not in
the clone set, given that you elect a candidate outside the clone set,
changes, that's crowding.
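
A minimal Python sketch of this probabilistic classification follows,
assuming you already have the win probabilities before and after
cloning as dicts. The function name classify_lottery, the tolerance
parameter, and the example probabilities (a uniform tiebreak, chosen
only for round numbers) are all assumptions of mine.

```python
def classify_lottery(before, after, original, clones, tol=1e-9):
    """Label a clone failure from win probabilities before/after cloning.

    before, after: dicts mapping candidate -> probability of winning.
    original:      the candidate that was cloned.
    clones:        the clone set after cloning (includes the original).
    """
    clones = set(clones)
    p_before = before.get(original, 0.0)
    p_after = sum(after.get(c, 0.0) for c in clones)
    labels = set()
    if p_after < p_before - tol:
        labels.add("vote-splitting")
    if p_after > p_before + tol:
        labels.add("teaming")
    # The conditional probabilities are only defined when an outside
    # candidate can win both before and after; otherwise skip the check.
    if p_before < 1 - tol and p_after < 1 - tol:
        outside = (set(before) | set(after)) - clones - {original}
        for x in outside:
            cond_before = before.get(x, 0.0) / (1 - p_before)
            cond_after = after.get(x, 0.0) / (1 - p_after)
            if abs(cond_before - cond_after) > tol:
                labels.add("crowding")
    return labels


# Hypothetical numbers: {A, B} tie before, {A, B, C} tie after.
result = classify_lottery({"A": 0.5, "B": 0.5},
                          {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3},
                          "A", {"A", "A2"})
print(sorted(result))  # ['crowding', 'vote-splitting']
```

With these numbers, the A, B to A, B, C example gets both labels.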

> But I can't justify ignoring cases that are problematic for just one
> method.  That could cause higher failure rates for the other methods,
> especially the ones that include tie-breaking rules.  That would not be
> a fair comparison across the different methods.
> 
> I regard this as one of the reasons why such measurements have not
> already been done.  It's challenging!

There's another problem as well that I've been mentioning numerous times
:-) When you count "how often" or "how likely", you have to assume some
probability distribution - i.e. how often a particular election is going
to appear. This probability distribution may well change as voters come
to learn the method.

FairVote makes a big deal (as I understand it) about IRV tending to
elect Condorcet winners. However, this is under the probability
distribution inherited from Plurality, where you have two large parties
and a bunch of fringe ones whose only danger is splitting the vote of
major-party voters.

Given some time, the voters may think that now they can vote for third
parties without any harm; but when a third party gets large enough and
is everybody's second choice, cue Burlington.

How often does IRV fail to elect the Condorcet winner? That depends on
whether there are any third parties around.

So any "how often" calculation should be qualified by noting just what
kind of distribution/voter behavior the calculation assumes: whether
it's uniform utilities, impartial culture, something spatial, or based
on polling data like Green-Armytage's analysis.

> Yet being able to yield numeric comparisons is essential to move beyond
> the simplistic zero versus non-zero failure-rate arguments.
> 
> Kristofer, as always I greatly appreciate your wise feedback!!

Thank you. :-)

-km