[EM] Condorcet-Kemeny calculations are not NP-hard

Thu Sep 29 00:43:57 PDT 2011

Contrary to common belief, solving the Condorcet-Kemeny problem is _not_ 
NP-hard.  It can be solved in polynomial time, even in the most 
challenging cases.

To demonstrate (but not yet prove) the validity of this claim I offered 
to take on Warren Smith's "Kemeny challenge" if he would supply specific 
meaningful ballot information for a challenging 100-candidate case. 
Warren has not yet done so.  (One way to provide such data would be to 
output ballot data from his simulation software when it encounters a few 
especially challenging cases.)

Warren's proposed pairwise-count data pattern is not usable because my 
software is designed for real-life use, not academic analysis.  I need 
ballot data, and my software generates the corresponding pairwise 
counts.  (Even if I could plug in pairwise counts, I would want to 
ensure that the pairwise counts can arise from specific ballot data.)

Of course proving (not just demonstrating) that Condorcet-Kemeny 
calculations are not NP-hard requires revealing how the calculations can 
be done in polynomial time.  Soon I will be revealing how that can be done.

If I were a university professor, years ago I would have published an 
academic paper that reveals a polynomial-time way to calculate any 
Condorcet-Kemeny sequence because I would have the time and financial 
support that such publications require.  For better or worse, I am in 
the business world, not the academic world.  The fact that I do not get 
financial support for my election-method reform efforts also accounts 
for the delays in my responding to questions in this forum that require 
detailed explanations.

Even though I am not a university professor, I have a Bachelor of 
Science degree in Physics from the University of California at Davis, so 
I have a strong background in understanding academic publications and I 
have a firm grasp of advanced mathematics.

This message is not the first time I have made this not-NP-hard claim. 
Years ago I made this claim in the discussion page for the Wikipedia 
article titled "Kemeny-Young method".

During the first few years that my VoteFair.org software did 
election/survey/poll calculations (using VoteFair popularity ranking, 
which is the method that is now described in Wikipedia as the 
"Kemeny-Young method"), I limited the number of choices to six choices 
for public use and 10 choices with special permission.  Then the 
American Idol TV show announced that there would be 12 singers in the 
same group (by gender, and by overall), so I had to accommodate that 
change in my American Idol poll.  After that, in most cases the full 
ranking calculations could be done without calculating all the sequence 
scores.  Then one day after just a few people had voted I encountered an 
11-choice case that would have required about 12 hours of computation 
time to resolve.  When more votes were added, that unusually cyclic case 
disappeared, yet I knew it was time to resume my efforts to find a 
faster method.  I saved that case and then put in lots of effort, 
supported by insights gained from lots of experience calculating 
VoteFair-popularity/Condorcet-Kemeny rankings, to arrive at a 
calculation method that only increases in polynomial time as the number 
of choices increases.  Since then I have further refined the method.

Currently I am preparing to publicly share the calculation method.  At 
that time I will be able to deliver the proof.

Currently what I can offer is to demonstrate the method by calculating 
-- hopefully within a day, not a lifetime -- the full Condorcet-Kemeny 
ranking for a challenging 100-candidate case (after Warren has posted 
the ballot information for such a case).  That should demonstrate that 
the calculation time for a computationally challenging 30-candidate or 
40-candidate election would be very reasonable, probably less than an hour.

In the meantime, until I have shared my proof, please keep in mind that 
there are no proofs of the contrary belief, which is that 
Condorcet-Kemeny calculations are NP-hard.  Until there is a proof one 
way or another, I suggest that the appropriate assumption should be "we 
don't yet know".

A copy of Warren's "Kemeny update" message appears below.  Note that it 
assumes that Condorcet-Kemeny calculations are NP-hard, yet he does not 
offer any proof of that claim.  (Failing to find a proof in academic 
publications does not provide proof of the opposite.)

Warren's message also includes comments on other topics besides the 
calculation time, and I addressed those in a separate message.  (To 
reiterate, VoteFair Ranking is fully and clearly described in my book 
"Ending The Hidden Unfairness In U.S. Elections"; the book is not 
intended to be a manual for software developers.)

I am not asking anyone to believe what has not yet been proven, but I am 
asking that the Condorcet-Kemeny method not be dismissed as "too slow" 
just because a fast way to calculate the results has not yet been 
published in any academic publication.

And because some of you will remain convinced otherwise until the proof 
is presented, also consider that if Warren or anyone else claims that a 
real-life single-winner election should accommodate more than 20 
candidates who each have a computational possibility of winning the 
single seat, I would claim that most of the election's voters cannot 
meaningfully mark any kind of ballot, and that any calculation method 
would produce disputable results (if there is no Condorcet winner and 
each of the 20-or-more candidates really are all possible winners).  In 
other words, even _if_ Condorcet-Kemeny calculations were impractical 
for dependably handling more than 20 candidates, Condorcet-Kemeny 
calculations can easily handle all well-designed governmental elections.

Some people have questioned why it is worth doing Condorcet-Kemeny 
calculations when there are faster Condorcet methods.  The answer is 
that Condorcet-Kemeny results offer lots of advantages, including one 
named advantage that is currently underappreciated and a significant 
advantage that has not yet been named or widely recognized.  The only 
valid disadvantage is that in very rare cases it fails the "independence 
of clones" criteria, yet this perceived disadvantage does not yield 
unfair results if all aspects of VoteFair Ranking are used.

Finally, just in case it is not obvious, of course Condorcet-Kemeny 
calculations do require non-polynomial time if all the sequence scores 
are calculated.  Yet, as I have said before, it is not necessary to 
calculate every sequence score in order to identify the sequence with 
the highest score.

Richard Fobes

On 9/19/2011 2:20 PM, Warren Smith wrote:
> To reiterate and/or answer some questions:
>
> 1. there is no way to find the Kemeny winner that is much faster than
> finding the full Kemeny ordering. More precisely, both are NP-complete tasks
> so there is no poly-time algorithm for either unless P=NP.
>
> 2. Even if some benificent God helpfully informs you of the name of
> the Kemeny winner,
> or the full Kemeny ordering, then there is no easy way for you
> to confirm or deny that claim.   No matter how much supporting information
> God provides to you (if it is only a polynomially-bounded number of
> bits) -- such
> as a proof -- there remains no way to confirm or deny either claim that runs in
> polynomial time, unless NP=coNP.  In other words, no short proofs exist.
>
> 3. It IS possible to find both the Kemeny
> ordering and Kemeny winner, in any election, if you are willing to
> devote enough compute time to it.  But the amount of time needed will
> exceed any polynomial in the #candidates.  Every currently known
> algorithm in the papers I cited fails for easy-to-generate and fairly natural
> 40-candidate elections, no matter how much time they devote to it
> within the limits of their finances.
>
> 4. The hardest elections have got Smith set = the full set of candidates.
> This is asymptotically not a great restriction since random elections have
> Smith set = full set, with probability -->1 in the limit as
> #candidates -->  infinity.
>
> 5. I posed as a challenge, a certain randomized class of 27-candidate elections
> which I designed to be hard.
> It seemed plausible to me that these 27-candidate elections might be too hard
> for current algorithms to reliably provably find the Kemeny winner.
> I of course do not know the winners of my challenge elections, since if I did,
> then a short proof would exist, which it cannot in maximally hard elections.
> So far, nobody has convinced me they are able to answer the Kemeny challenge,
> but Kristofer Munsterhjelm has preliminarily claimed in email that
> he's been able to solve
> one election sampled from my class -- which causes
> him to be optimistic that he can answer the challenge.
> We'll see -- if that is true, I don't see why K.M. hasn't completed
> the job, since
> it seems like if he's done what he thinks he has, then he's
> done 99% of the programming work
> and the remaining 1% is for some reason holding him up.
> In the event KM or somebody else successfully
> answers the challenge, then I'll be reasonably impressed,
> but then I probably can create harder challenges
> with greater numbers of candidates -- the goal is to
> try to identify the borderline between "hard" and "easy."
>
> 6. Richard Fobes has told me and the election methods group in emails
> that he's got various
> wonderful answers to all these issues, which he sadly has never found
> the time to
> explain, including in his privately published book on the subject.  I
>    (a) just don't believe him.
>    (b) some of his "great solutions" seem to be now to recommend a
> hybrid of about 3 different
> election methods -- which he has nowhere fully defined.   This hybrid
> method no longer can
> claim either simplicity or any attractive package of nice logical
> properties, but it should at least
> make it always feasible to compute the winner..
>