[EM] Kemeny update

Sat Sep 24 10:56:19 PDT 2011

On Sat, Sep 24, 2011 at 10:50 AM, Kristofer Munsterhjelm
<km_elmet at lavabit.com> wrote:
> Warren Smith wrote:
>>
>> To reiterate and/or answer some questions:
>>
>> 1. there is no way to find the Kemeny winner that is much faster than
>> finding the full Kemeny ordering. More precisely, both are NP-complete
>> tasks
>> so there is no poly-time algorithm for either unless P=NP.
>>
>> 2. Even if some benificent God helpfully informs you of the name of
>> the Kemeny winner,
>> or the full Kemeny ordering, then there is no easy way for you
>> to confirm or deny that claim.   No matter how much supporting information
>> God provides to you (if it is only a polynomially-bounded number of
>> bits) -- such
>> as a proof -- there remains no way to confirm or deny either claim that
>> runs in
>> polynomial time, unless NP=coNP.  In other words, no short proofs exist.
>>
>> 3. It IS possible to find both the Kemeny
>> ordering and Kemeny winner, in any election, if you are willing to
>> devote enough compute time to it.  But the amount of time needed will
>> exceed any polynomial in the #candidates.  Every currently known
>> algorithm in the papers I cited fails for easy-to-generate and fairly
>> natural
>> 40-candidate elections, no matter how much time they devote to it
>> within the limits of their finances.
>>
>> 4. The hardest elections have got Smith set = the full set of candidates.
>> This is asymptotically not a great restriction since random elections have
>> Smith set = full set, with probability -->1 in the limit as
>> #candidates --> infinity.
>
> Is Kemeny independent of Smith-dominated alternatives? Toby said he thought
> he was. If it is, then Kemeny is feasible for practical elections because
> you can just restrict it to the Smith set (and the Smith set won't be very
> large in practice). That is, of course, unless the existence of advanced
> voting methods will make the public vote in ways that will lead to a greater
> Smith set (e.g. candidates who otherwise didn't want to split the vote now
> feel bold enough to show up). Still, even so, I can't imagine there would be
> a 27-member Smith set.

--seems to me, KM is correct; if the Smith Set has less than about 20
members (and if this also is true recursively upon removing it), then
it always will be feasible to find the Kemeny order.

How often will a (>=20)-member Smith set arise in practice?

I do not know.   KM "can't imagine" it will happen, but that's his guess.
I think if you take a 100-candidate election with random-uniform
votes, it'd happen pretty often. Such elections have Condorcet winners
about 9-12% of the time (says my computer), which leads me to guess
that they have (>=20)-member Smith sets about 10% of the time.  See
also puzzles
25 and 26 here for some theorems about the closely related problem of
random "round robin tournaments" rather than random "elections":
   http://www.rangevoting.org/PuzzlePage.html

But it could be argued/hoped that 100-candidate elections, when they
occur, will have vote distributions quite dissimilar to
random-uniform.
It could be counterargued that humanity's experience with
100-candidate elections is small, and with 100-candidate
rank-order-ballot
elections is even smaller, so there is little basis for KM's guess.

Tideman in his book has got a statistical semiempirical model, based
on real election statistics, of rank-order ballot elections.  His
model is described/corrected/recalculated here:
    http://rangevoting.org/TidemanElModel.html
and it finds the probability, in a 100-canddt election,that a
Condorcet Winner exists, is 27%, and for a 200-canddt
Tideman-statistics election this probability is 5.6%.

You also may enjoy the following easy exercise about the random
tournament model.  Consider N=A+B
candidates.  Suppose for any pair of candidates, which one wins
pairwise, is determined by a coin flip.
[(N-1)*N/2 independent coin flips in all.]
QUESTION: what is the probability P that an A-member Smith set exists,
i.e that there exists some subset of A of the N candidates, so that these A
pairwise-defeat each of the remaining B candidates?
ANSWER:
   P = binomial(N,A) * 2^(-A*B)
QUESTION: what is the probability that ANY nontrivial Smith set exists?
I.e. what is the chance that there exists ANY nonempty nonfull subset
of the N candidates, whose members pairwise defeat all nonmembers?
ANSWER (arises since summing the above answer from A=1 to N-1 using
B=N-A yields an upper bound on this probability...):
    Prob(Nontrivial Smith Set) --> 0   when N--> infinity
In particular for N=20 candidates we find
    Prob(Nontrivial Smith Set) < 0.0000763
and for N=100 candidates we find
     Prob(Nontrivial Smith set) < 3.2 * 10^(-28).

The random-votes election model is harder to calculate with than the
random tournament model but it usually has the same qualitative
behavior.

> I think I solved a few instances of Warren's 27x27 matrix on [0...1
> billion], but I can't be sure since I can't confirm that my Kemeny cost is
> indeed the least possible. I am, however, assuming that it is, and so I'm
> looking at larger sizes. Since these will be harder, I've thought of adding
> an initialization step where the program tells the IP solver of a reasonably
> low-cost guess. This could speed up the process,

--certainly it could.  If you use some nonrigorous heuristic to find a
good order, then you could then use the fact that the optimal order
must be at least that good, to help "prune" a subsequent exhaustive
search.
Lower bounds (i.e. in the other direction) can also be used for pruning
(if you have any).

If you have an integer program it can sometimes happen that the optimal
solution of the linear program (without constraint of having to have integers)
happens by luck to be all-integer.  In such a case, that proves it is optimal
for the integer program too.   However, in general integer programming
is NP-hard and such luck cannot be guaranteed.

Sometimes you can get partial such luck, e.g. you exhaustively try all
integer possibilities for the first two variables, and then you get
lucky as above about all the remaining variables in each case.   That
also would
find the optimum and yield an optimality proof.

If that is what happened for KM, then we could ask: what is the
"success rate," i.e. what percentage of the time do his integer
programs enjoy these
kinds of luck?

-- 
Warren D. Smith
http://RangeVoting.org  <-- add your endorsement (by clicking
"endorse" as 1st step)