[EM] Participation criterion and Condorcet rules

Wed Aug 15 21:23:29 PDT 2018

On 8/14/2018 7:28 AM, Kristofer Munsterhjelm wrote:
 > As I said, it's always NP-hard, because NP-hard is a *worst-case 
property*.

If you mean that the Condorcet-Kemeny METHOD is NP-hard, yes, I agree.

 > NP-hard is not always hard :-)

Yes, this is my point.

Yet critics claim (or imply) that if the Condorcet-Kemeny method were 
used in real-life elections, then the computation time would be too long.

As I've pointed out, long computation times for calculating a single 
winner when there are more than about 50 (or maybe 200) ballots would be 
very rare (regardless of how many candidates there are).

 > ....

 > 1: A>B>D>E>C
 > 1: A>C>B>D>E
 > 1: A>D>C>B>E
 > 1: A>E>B>D>C
 > 1: B>E>C>A>D
 > 1: B>E>C>A>D
 > 1: C>A>D>B>E
 > 1: C>B>E>A>D
 > 1: C>B>E>A>D
 > 1: D>C>A>B>E
 > 1: D>C>B>E>A
 > 1: D>C>E>B>A
 > 1: D>E>A>B>C
 > 1: D>E>A>B>C
 > 1: E>C>A>D>B
 > 1: E>C>D>B>A
 > 1: E>D>B>C>A
 >
 > According to Kemeny score, there's a tie for first between D and E
 > (score 93 for orderings D>C>B>E>A and E>C>A>D>B), but VoteFair says that
 > C (score 92 for C>E>A>D>B) is the unique winner.

The method created by John Kemeny does not specify how to deal with 
ties.  (If someone knows otherwise, I'd love to read what Kemeny wrote 
about handling ties.)

The VoteFair ranking software DOES deal with ties.

It turns out that ties using the Condorcet-Kemeny method can be very 
complex, and in subtle ways.

Before jumping to your complex case, it's useful to consider the simple 
case where the result sequences D>C>B>E>A and C>D>B>E>A have the same 
sequence scores.  In this case it's obvious that D and C are tied.

In your example above, where D>C>B>E>A and E>C>A>D>B have the same 
sequence score, this simplicity does not exist.

Without careful analysis I'm going to assume that the reason VoteFair 
ranking identifies C as the winner is that C is pairwise preferred over 
D and C is pairwise preferred over E.

(In a textbook the words "this exercise is left to the reader" would 
appear here.)

Regardless of whether that assumption is correct (I don't have time to 
check it), I'll repeat that you have identified a case where John 
Kemeny's described method does not specify who should win!

When I was writing the code to handle ties I encountered some bizarre 
cases where there was circular ambiguity in odd ways, such as in your 
example.

The code I wrote and posted on GitHub (in the CPSolver account) 
clarifies how I decided to resolve those tied cases.

When you claim that my software does not match what John Kemeny 
described, remember that he did not specify what to do when there is 
more than a single highest sequence score (or, more precisely, a single 
lowest "Kemeny" score since his version counts opposition rather than 
support).

I should clarify that the code at GitHub is more recent than the code at 
VoteFair.org, so there might be some differences between those two 
versions of the VoteFair ranking software.  But those differences should 
only apply to cases that involve ties.  If there is only one highest 
sequence score, both versions should produce the same results 
(remembering to consider the setting that specifies how many choices to 
consider when doing the full check-all-the-sequence-scores calculations).

Kristofer, I appreciate your rigorous checking of the Condorcet-Kemeny 
method as calculated by my VoteFair ranking software.  Thanks!

 > ....
 > .... For instance, IRV advocates sometimes claim that monotonicity
 > failure happens so rarely that IRV should be considered to pass
 > monotonicity in all real-world settings. From a mathematical point of
 > view, a method fails some property as long as there's at least one
 > example, but a common response is "but that one doesn't count, find
 > another".
 >
 > For VoteFair, I don't think it's too likely that it gets the result
 > wrong in real life. But I could be wrong. I don't know if the number of
 > wrong-call elections stay low as the number of candidates increases, or
 > if there's some strategy that exploits these. In contrast, the behavior
 > of Kemeny itself is well known.

Actually measurements of HOW OFTEN each vote-counting method fails each 
fairness criterion is not well-known.

If someone wants to tackle that ambitious project in a simpler way, it 
can be done by running simulations that start with a few number of 
ballots (maybe 9 to start with) and a few number of candidates (three), 
and testing all possible preferences for all the voters.  Then those 
numbers can be increased one by one.  That would yield measurements that 
show whether the failure rate settles into an integer percentage value 
(ignoring decimal numbers).

The tricky part of such an analysis is for the criteria that involve two 
related scenarios, such as with and without a clone candidate.

If that analysis is ever done, the difference between the "official" 
Condorcet-Kemeny method that does not handle ties, and the VoteFair 
ranking method that does handle ties, will be very tiny!

What will be more interesting is to see if there is any significant 
difference in frequencies between the insertion-sort-like-presort that 
I've used in the VoteFair ranking software, and the "official" John 
Kemeny method (with ties being avoided).

Richard Fobes

On 8/14/2018 7:28 AM, Kristofer Munsterhjelm wrote:
> On 2018-08-11 07:22, VoteFair wrote:
>> On 8/10/2018 5:58 AM, Kristofer Munsterhjelm wrote:
>>  > ....
>>  > 1: A>E>G>B>D>F>C
>>  > 1: B>F>D>A>G>E>C
>>  > 1: C>G>B>E>F>D>A
>>  > 1: F>A>E>D>B>C>G
>>  > 1: F>C>D>A>E>G>B
>>
>> This is a good example of my point that real-life elections would
>> almost never have a carefully constructed situation like this where it
>> would be "NP-hard" to correctly calculate the winner using the
>> Condorcet-Kemeny method.
>>
>> I do agree with Kristofer's correction that, in SOME cases, it can be
>> NP-hard to calculate the winner.
>
> As I said, it's always NP-hard, because NP-hard is a *worst-case property*.
>
> The subset sum problem "given a set of numbers A and a number b, find a
> subset of A that sums to b" is NP-hard (and the decision problem is
> NP-complete).
> However, if the numbers in A are so that the every number is greater in
> value than the sum of all the numbers smaller than it, then it's easy.
> All you have to do is find the greatest number less than b, add it to
> the solution, subtract this number from b, and repeat. If you end up
> with excess at the end, then there's no such subset of A that sums to b.
>
> NP-hard is not always hard :-)
>
>> Currently that setting is, I believe, without looking, 6 choices.
>> That's why the example here has 7 choices.
>
> That was mainly for convenience. I have one with 5 as well, but it's
> more messy as I haven't let the script run through as many different
> examples:
>
> 1: A>B>D>E>C
> 1: A>C>B>D>E
> 1: A>D>C>B>E
> 1: A>E>B>D>C
> 1: B>E>C>A>D
> 1: B>E>C>A>D
> 1: C>A>D>B>E
> 1: C>B>E>A>D
> 1: C>B>E>A>D
> 1: D>C>A>B>E
> 1: D>C>B>E>A
> 1: D>C>E>B>A
> 1: D>E>A>B>C
> 1: D>E>A>B>C
> 1: E>C>A>D>B
> 1: E>C>D>B>A
> 1: E>D>B>C>A
>
> According to Kemeny score, there's a tie for first between D and E
> (score 93 for orderings D>C>B>E>A and E>C>A>D>B), but VoteFair says that
> C (score 92 for C>E>A>D>B) is the unique winner.
>
> In any case, I see what you're saying, but the introduction of "real
> life cases" makes analysis much harder and can potentially open up
> loopholes. For instance, IRV advocates sometimes claim that monotonicity
> failure happens so rarely that IRV should be considered to pass
> monotonicity in all real-world settings. From a mathematical point of
> view, a method fails some property as long as there's at least one
> example, but a common response is "but that one doesn't count, find
> another".
>
> For VoteFair, I don't think it's too likely that it gets the result
> wrong in real life. But I could be wrong. I don't know if the number of
> wrong-call elections stay low as the number of candidates increases, or
> if there's some strategy that exploits these. In contrast, the behavior
> of Kemeny itself is well known.
>
> To avoid that tangle, perhaps you could use an algorithm that is always
> correct, but in a few rare cases runs for a very long time (or times out
> with "can't tell in the time allotted").
>
> That way, all properties that hold for Kemeny would also hold for
> VoteFair, and the counters would know whether they were in what a
> hard-to-tell election or not without having to calculate the exact
> Kemeny scores for comparison.
>
> The simplest way to do that, that I can think of, is to use an integer
> programming library/interface like Math::LP and set up the program for
> solving a Kemeny election.
> https://www.cs.cmu.edu/~conitzer/kemenyAAAI06.pdf page 4 gives a linear
> programming approximation to determining the Kemeny score of a
> candidate. The approximation is exact if the x_(a,b) variables are
> specified to be integers.
>
> There are also fixed-parameter tractable algorithms whose runtime is
> polynomial if the number of upsets is fixed, but exponential in the
> number of upsets, e.g.
>
> RAMAN, Venkatesh; SAURABH, Saket. Parameterized algorithms for feedback
> set problems and their duals in tournaments. Theoretical Computer
> Science, 2006, 351.3: 446-458,
>
> but coding them from scratch could be tedious.