[EM] Condorcet-Kemeny calculations are not NP-hard
Richard Fobes
ElectionMethods at VoteFair.org
Thu Sep 29 00:43:57 PDT 2011
Contrary to common belief, solving the Condorcet-Kemeny problem is _not_
NP-hard. It can be solved in polynomial time, even in the most
challenging cases.
To demonstrate (but not yet prove) the validity of this claim I offered
to take on Warren Smith's "Kemeny challenge" if he would supply specific
meaningful ballot information for a challenging 100-candidate case.
Warren has not yet done so. (One way to provide such data would be to
output ballot data from his simulation software when it encounters a few
especially challenging cases.)
Warren's proposed pairwise-count data pattern is not usable because my
software is designed for real-life use, not academic analysis. I need
ballot data, and my software generates the corresponding pairwise
counts. (Even if I could plug in pairwise counts, I would want to
ensure that the pairwise counts can arise from specific ballot data.)
Of course proving (not just demonstrating) that Condorcet-Kemeny
calculations are not NP-hard requires revealing how the calculations can
be done in polynomial time. Soon I will be revealing how that can be done.
If I were a university professor, years ago I would have published an
academic paper that reveals a polynomial-time way to calculate any
Condorcet-Kemeny sequence because I would have the time and financial
support that such publications require. For better or worse, I am in
the business world, not the academic world. The fact that I do not get
financial support for my election-method reform efforts also accounts
for the delays in my responding to questions in this forum that require
detailed explanations.
Even though I am not a university professor, I have a Bachelor of
Science degree in Physics from the University of California at Davis, so
I have a strong background in understanding academic publications and I
have a firm grasp of advanced mathematics.
This message is not the first time I have made this not-NP-hard claim.
Years ago I made this claim in the discussion page for the Wikipedia
article titled "Kemeny-Young method".
During the first few years that my VoteFair.org software did
election/survey/poll calculations (using VoteFair popularity ranking,
which is the method that is now described in Wikipedia as the
"Kemeny-Young method"), I limited the number of choices to six choices
for public use and 10 choices with special permission. Then the
American Idol TV show announced that there would be 12 singers in the
same group (by gender, and by overall), so I had to accommodate that
change in my American Idol poll. After that, in most cases the full
ranking calculations could be done without calculating all the sequence
scores. Then one day after just a few people had voted I encountered an
11-choice case that would have required about 12 hours of computation
time to resolve. When more votes were added, that unusually cyclic case
disappeared, yet I knew it was time to resume my efforts to find a
faster method. I saved that case and then put in lots of effort,
supported by insights gained from lots of experience calculating
VoteFair-popularity/Condorcet-Kemeny rankings, to arrive at a
calculation method that only increases in polynomial time as the number
of choices increases. Since then I have further refined the method.
Currently I am preparing to publicly share the calculation method. At
that time I will be able to deliver the proof.
Currently what I can offer is to demonstrate the method by calculating
-- hopefully within a day, not a lifetime -- the full Condorcet-Kemeny
ranking for a challenging 100-candidate case (after Warren has posted
the ballot information for such a case). That should demonstrate that
the calculation time for a computationally challenging 30-candidate or
40-candidate election would be very reasonable, probably less than an hour.
In the meantime, until I have shared my proof, please keep in mind that
there are no proofs of the contrary belief, which is that
Condorcet-Kemeny calculations are NP-hard. Until there is a proof one
way or another, I suggest that the appropriate assumption should be "we
don't yet know".
A copy of Warren's "Kemeny update" message appears below. Note that it
assumes that Condorcet-Kemeny calculations are NP-hard, yet he does not
offer any proof of that claim. (Failing to find a proof in academic
publications does not provide proof of the opposite.)
Warren's message also includes comments on other topics besides the
calculation time, and I addressed those in a separate message. (To
reiterate, VoteFair Ranking is fully and clearly described in my book
"Ending The Hidden Unfairness In U.S. Elections"; the book is not
intended to be a manual for software developers.)
I am not asking anyone to believe what has not yet been proven, but I am
asking that the Condorcet-Kemeny method not be dismissed as "too slow"
just because a fast way to calculate the results has not yet been
published in any academic publication.
And because some of you will remain convinced otherwise until the proof
is presented, also consider that if Warren or anyone else claims that a
real-life single-winner election should accommodate more than 20
candidates who each have a computational possibility of winning the
single seat, I would claim that most of the election's voters cannot
meaningfully mark any kind of ballot, and that any calculation method
would produce disputable results (if there is no Condorcet winner and
each of the 20-or-more candidates really are all possible winners). In
other words, even _if_ Condorcet-Kemeny calculations were impractical
for dependably handling more than 20 candidates, Condorcet-Kemeny
calculations can easily handle all well-designed governmental elections.
Some people have questioned why it is worth doing Condorcet-Kemeny
calculations when there are faster Condorcet methods. The answer is
that Condorcet-Kemeny results offer lots of advantages, including one
named advantage that is currently underappreciated and a significant
advantage that has not yet been named or widely recognized. The only
valid disadvantage is that in very rare cases it fails the "independence
of clones" criteria, yet this perceived disadvantage does not yield
unfair results if all aspects of VoteFair Ranking are used.
Finally, just in case it is not obvious, of course Condorcet-Kemeny
calculations do require non-polynomial time if all the sequence scores
are calculated. Yet, as I have said before, it is not necessary to
calculate every sequence score in order to identify the sequence with
the highest score.
Richard Fobes
On 9/19/2011 2:20 PM, Warren Smith wrote:
> To reiterate and/or answer some questions:
>
> 1. there is no way to find the Kemeny winner that is much faster than
> finding the full Kemeny ordering. More precisely, both are NP-complete tasks
> so there is no poly-time algorithm for either unless P=NP.
>
> 2. Even if some benificent God helpfully informs you of the name of
> the Kemeny winner,
> or the full Kemeny ordering, then there is no easy way for you
> to confirm or deny that claim. No matter how much supporting information
> God provides to you (if it is only a polynomially-bounded number of
> bits) -- such
> as a proof -- there remains no way to confirm or deny either claim that runs in
> polynomial time, unless NP=coNP. In other words, no short proofs exist.
>
> 3. It IS possible to find both the Kemeny
> ordering and Kemeny winner, in any election, if you are willing to
> devote enough compute time to it. But the amount of time needed will
> exceed any polynomial in the #candidates. Every currently known
> algorithm in the papers I cited fails for easy-to-generate and fairly natural
> 40-candidate elections, no matter how much time they devote to it
> within the limits of their finances.
>
> 4. The hardest elections have got Smith set = the full set of candidates.
> This is asymptotically not a great restriction since random elections have
> Smith set = full set, with probability -->1 in the limit as
> #candidates --> infinity.
>
> 5. I posed as a challenge, a certain randomized class of 27-candidate elections
> which I designed to be hard.
> It seemed plausible to me that these 27-candidate elections might be too hard
> for current algorithms to reliably provably find the Kemeny winner.
> I of course do not know the winners of my challenge elections, since if I did,
> then a short proof would exist, which it cannot in maximally hard elections.
> So far, nobody has convinced me they are able to answer the Kemeny challenge,
> but Kristofer Munsterhjelm has preliminarily claimed in email that
> he's been able to solve
> one election sampled from my class -- which causes
> him to be optimistic that he can answer the challenge.
> We'll see -- if that is true, I don't see why K.M. hasn't completed
> the job, since
> it seems like if he's done what he thinks he has, then he's
> done 99% of the programming work
> and the remaining 1% is for some reason holding him up.
> In the event KM or somebody else successfully
> answers the challenge, then I'll be reasonably impressed,
> but then I probably can create harder challenges
> with greater numbers of candidates -- the goal is to
> try to identify the borderline between "hard" and "easy."
>
> 6. Richard Fobes has told me and the election methods group in emails
> that he's got various
> wonderful answers to all these issues, which he sadly has never found
> the time to
> explain, including in his privately published book on the subject. I
> (a) just don't believe him.
> (b) some of his "great solutions" seem to be now to recommend a
> hybrid of about 3 different
> election methods -- which he has nowhere fully defined. This hybrid
> method no longer can
> claim either simplicity or any attractive package of nice logical
> properties, but it should at least
> make it always feasible to compute the winner..
>
More information about the Election-Methods
mailing list