[EM] Participation criterion and Condorcet rules

Wed Sep 5 10:39:58 PDT 2018

On 9/1/2018 3:41 PM, Kristofer Munsterhjelm wrote:
 > On 2018-08-16 06:23, VoteFair wrote:
 >> On 8/14/2018 7:28 AM, Kristofer Munsterhjelm wrote:
...
 > I was under the impression that VoteFair's standard was to rank
 > candidates according to their Kemeny score, e.g. if there's an order
 > starting with A with Kemeny score 91 and another starting with B with
 > Kemeny score 92, you get B>A. ....
 >
 > But from what you're saying, it seems that I'm wrong, and that you're
 > instead aiming to preserve the actual Kemeny ordering, i.e. if A>B>C>D
 > is the unique ordering with the highest Kemeny score, VoteFair should
 > return A>B>C>D in that order, even if, say, the greatest Kemeny score
 > ordering starting in C has a greater Kemeny score than the greatest
 > ordering starting in B.

That's right, only the highest score is significant.  And it is 
associated with a ranking or sequence, and NOT associated with a 
specific candidate.

Specifically, in VoteFair popularity ranking, the sequence that is 
associated with the highest "sequence score" indicates the full ranking 
from most popular to least popular.

For clarification, in the method described by John Kemeny, the sequence 
associated with the LOWEST "Kemeny score" indicates the full ranking. 
Remember that the method described by John Kemeny counts opposition 
rather than support.

 > As for dealing with ties, I can think of two ways of doing so. ...

Keep in mind that there are several kinds of ties.

One kind of tie that can happen in the Condorcet-Kemeny method is when 
more than one sequence has the same highest "sequence score" (or the 
lowest "Kemeny score").

The method described by John Kemeny does not deal with how to resolve 
such a tie.

The VoteFair ranking software, which is available as open-source code, 
does specify how to deal with that kind of tie.

Another kind of tie is when two (or more) candidates are tied -- either 
for the winning seat or at any other ranking that might be significant.

I think this is the kind of tie your suggestions refer to.  If so, ...

My recommended approach is to first recount the ballots, and then, if 
there is still the same tie, a judicial court can cast a tie-breaking 
ballot.  In the case of any method that uses 1-2-3 ballots, the 
tie-breaking ballot cannot rank two candidates at the same preference level.

If instead you are thinking of something like the ties that can occur in 
IRV where an elimination round can involve a tie, then such a 
tie-breaking method that looks deeper into the ballots to resolve the 
tie is actually a technique for overcoming the fact that the raw method 
itself ignores very significant preference information.

The Condorcet-Kemeny method does not have that weakness.  Basically the 
method looks "deeply" into all the ballots.  As a result, there is not 
any preference information being ignored that should be reconsidered as 
part of a method-specific way to resolve such a "tie."

 >> Actually measurements of HOW OFTEN each vote-counting method fails each
 >> fairness criterion is not well-known.
 >
 > The problem with such a calculation ...

Later, when I have time again, I'll reply to this part of your message. 
And I'll start a new thread because this topic applies to all methods.

Once again, thank you Kristofer for your great contributions to this forum!

Richard Fobes

9/1/2018 3:41 PM, Kristofer Munsterhjelm wrote:
> On 2018-08-16 06:23, VoteFair wrote:
>> On 8/14/2018 7:28 AM, Kristofer Munsterhjelm wrote:
>>> As I said, it's always NP-hard, because NP-hard is a *worst-case
>>> property*.
>>
>> If you mean that the Condorcet-Kemeny METHOD is NP-hard, yes, I agree.
>
> Method X being NP-hard means "if you're given an algorithm that can
> solve any instance of X in polynomial time, you can solve any decision
> problem that is in NP in polynomial time, as well". Since these
> reductions are on a method level rather than on an instance level,
> NP-hardness is only defined for problems as a whole.
>
> So you can't say that a problem instance is or isn't NP-hard. Hence "the
> METHOD is NP-hard" is the only way one can speak of NP-hardness.
>
>> Yet critics claim (or imply) that if the Condorcet-Kemeny method were
>> used in real-life elections, then the computation time would be too long.
>>
>> As I've pointed out, long computation times for calculating a single
>> winner when there are more than about 50 (or maybe 200) ballots would be
>> very rare (regardless of how many candidates there are).
>
> The Conitzer et al. paper I linked to in my previous post makes a
> similar point. Because the mixed integer programming approach can
> quickly handle instances up to 40 candidates (using a good solver), even
> with voting patterns close to a tie, runtime is not the problem in
> itself. The only ways this could count against Kemeny is if we can find
> another method that is not NP-hard and is otherwise as good as Kemeny,
> or if the context is so that we deal with enormous numbers of candidates
> (e.g. web page ranking).
>
> My argument against Kemeny would probably be a combination of the former
> above (there are simpler methods that do just as well) and that Kemeny
> itself isn't cloneproof, so it's more vulnerable to tactical nomination
> than methods that are cloneproof. Thus I'd prefer MAM/Ranked Pairs,
> which is cloneproof, simpler to implement, and always polytime if you
> accept tiebreaking by random voter hierarchy.
>
> Of course, implicit in this is my value judgement that consistency is
> not all that important a criterion. Since every Condorcet method fails
> reinforcement, it seems unlikely that someone would say "I like methods
> where, if you have two districts and the same winner comes up in both,
> the election with the two ballot sets combined should also have the same
> winner. No Condorcet method does that, but at least Kemeny does it if
> the complete social order is the same in both districts, so I'm going to
> choose that one". Others might disagree with me.
>
>>
>>  > ....
>>
>>  > 1: A>B>D>E>C
>>  > 1: A>C>B>D>E
>>  > 1: A>D>C>B>E
>>  > 1: A>E>B>D>C
>>  > 1: B>E>C>A>D
>>  > 1: B>E>C>A>D
>>  > 1: C>A>D>B>E
>>  > 1: C>B>E>A>D
>>  > 1: C>B>E>A>D
>>  > 1: D>C>A>B>E
>>  > 1: D>C>B>E>A
>>  > 1: D>C>E>B>A
>>  > 1: D>E>A>B>C
>>  > 1: D>E>A>B>C
>>  > 1: E>C>A>D>B
>>  > 1: E>C>D>B>A
>>  > 1: E>D>B>C>A
>>  >
>>  > According to Kemeny score, there's a tie for first between D and E
>>  > (score 93 for orderings D>C>B>E>A and E>C>A>D>B), but VoteFair says that
>>  > C (score 92 for C>E>A>D>B) is the unique winner.
>>
>> The method created by John Kemeny does not specify how to deal with
>> ties.  (If someone knows otherwise, I'd love to read what Kemeny wrote
>> about handling ties.) >
>> The VoteFair ranking software DOES deal with ties.
>
> I was under the impression that VoteFair's standard was to rank
> candidates according to their Kemeny score, e.g. if there's an order
> starting with A with Kemeny score 91 and another starting with B with
> Kemeny score 92, you get B>A. This would make sense given how you said
> that VoteFair was primarily concerned with winners rather than full
> orderings, because ranking by Kemeny score would show how good a claim
> each candidate has on being the winner.
>
> But from what you're saying, it seems that I'm wrong, and that you're
> instead aiming to preserve the actual Kemeny ordering, i.e. if A>B>C>D
> is the unique ordering with the highest Kemeny score, VoteFair should
> return A>B>C>D in that order, even if, say, the greatest Kemeny score
> ordering starting in C has a greater Kemeny score than the greatest
> ordering starting in B.
>
> As for dealing with ties, I can think of two ways of doing so. The first
> is to use random voter hierarchy to construct a tiebreaker ordering and
> adding an epsilon (less than 1/numvoters^2) to every pairwise contest
> that agrees with the tiebreaker ordering. The second is to find every
> ordering that has maximum Kemeny score and recursively apply Kemeny to
> the orderings. Both tiebreaking methods preserve (their own) variant of
> consistency.
>
> The former would pass that if the tiebreak is the same for both, or if
> one district has a certain ordering be the unambiguous winner (no ties)
> and the other has that ordering tied with some others, and has that
> ordering as the RVH tiebreak, then the result for the combined district
> will be the same as for those two districts.
>
> The latter would pass it if both districts are ties with the same set of
> orderings tied for maximum Kemeny score, then the combined result would
> also have that set of orderings tied for maximum Kemeny score, and so
> the tiebreaker would provide the same result.
>
> It seems your approach is more in the spirit of the latter (as combining
> A>B>C>D and A>B>D>C to A>B>C=D looks like some kind of aggregation),
> except that the Kemeny method only returns strict orderings, and so
> would still need some final tiebreak or adjustment to deal with that
> scenario.
>
>> Actually measurements of HOW OFTEN each vote-counting method fails each
>> fairness criterion is not well-known.
>
> The problem with such a calculation is that it depends on the
> distribution of ballots. Formally speaking, the chance of getting a
> wrong scenario with k voters and n candidates is
>
> integral over all k-voter, n-candidate ballot sets x: p(x) * L(x) dx
>
> where L(x) is 1 if the method gets it wrong and 0 otherwise. We may
> replace the integral with an averaging operation if voter weights are
> always integer-valued (as in a real election).
>
> However, p(x), the probability of encountering ballot x, depends on how
> voters behave, and how the method induces the voters to behave, and
> opens up a considerable element of ambiguity. That's the problem with
> determining how often something happens. The criterion compliance
> approach to getting around this problem is to ensure that L(x) is zero
> for all possible x, as then the integral (or mean) will be zero no
> matter what.
>
> If we just want a raw count, we can let p be some predefined probability
> distribution, e.g. the Dirichlet distribution corresponding to choosing
> n! random numbers subject to that they sum up to k, or the discrete
> impartical culture which consists of picking a random ordering of the
> candidates and letting that be a ballot, k times. But such choices are
> still vulnerable to the claim that this amplifies/attenuates realistic
> regions and attenuates/amplifies unrealistic ones.
>
> For VoteFair, I think the value of the integral will be small for any
> reasonable probability distribution, at least unless there are too many
> candidates. But if we, instead of using a method that always runs in
> polynomial time but sometimes gets it wrong, use a method that always
> gets it right but sometimes takes a long time, then we would sidestep
> the whole issue entirely. So why not do so?
>
>> The tricky part of such an analysis is for the criteria that involve two
>> related scenarios, such as with and without a clone candidate.
>
> There are two ways of doing that, I think. To use monotonicity as an
> example:
>
> The first is to consider ballot set X to exhibit a monotonicity failure
> if there exists some ballot set Y so that:
> 	- the method elects some candidate A when given ballot set X
> 	- Y is equal to X, except that on some ballots, A has been raised
> (moved up in rank)
> 	- the method elects someone else than A when given ballot set Y
>
> (and similarly for X not electing A and then lowering A makes A win)
>
> The second is to consider all pairs X, Y so that
> 	- the method elects A when given ballot set X
> 	- Y is equal to X, except that on some ballots, A has been raised
> (moved up in rank)
>
> and then consider how many of these (X,Y) pairs have the method elect
> someone else than A when Y is given to the method, relative to the
> number of (X,Y) pairs possible.
>
> I prefer the former, because the latter doesn't mirror what really
> happens. The voters vote, and then the result is called; they don't
> vote, get a result, and then modify their ballots to (e.g.) raise the
> winner.
>