Dear all,<div><br></div><div>a small correction to my email below (a negation was forgotten):</div><div>I wrote: "<span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: 13px; background-color: rgb(255, 255, 255); ">Yes, it might happen, that a good variable is eliminated, which significantly improve the analysis, but this is very likely, since you normally have so many other good variables."</span></div>

<div><font class="Apple-style-span" face="arial, sans-serif">The text should read: "<span class="Apple-style-span" style="font-size: 13px; background-color: rgb(255, 255, 255); ">Yes, it might happen, that a good variable is eliminated, which significantly improve the analysis, but this is NOT very likely, since you normally have so many other good variables."</span></font></div>

<div><font class="Apple-style-span" face="arial, sans-serif"><br></font></div><div><font class="Apple-style-span" face="arial, sans-serif">My appologies and thanks for your understanding or at least patience.</font></div>

<div><font class="Apple-style-span" face="arial, sans-serif"><br>Best regards</font></div><div><font class="Apple-style-span" face="arial, sans-serif">Peter Zborník<br></font><br><div class="gmail_quote">On Sat, Sep 24, 2011 at 5:52 PM, Peter Zbornik <span dir="ltr"><<a href="mailto:pzbornik@gmail.com">pzbornik@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Dear all,<div><br></div><div>I guess there could be some simple elimination of candidates before the election, so that there will be a manageable set of candidates for the Kemeny election, like 10 to 15.</div>

<div><br></div>

<div>I guess sometime the winner would be lost in the reduction, but I would expect this to be extremely rare if just the elimination would be efficient.</div><div><br></div><div>Variable elimination is standard in multivariate data analysis with one response variable (the thing you like to predict) to reduce the number of variables, which initially can be several hundreds to something more manageable, like 40 variables. A common way to reduce the number of variables is just to do a single variable analysis with the response variable. The variables with a significance below a threshold are eliminated. Yes, it might happen, that a good variable is eliminated, which significantly improve the analysis, but this is very likely, since you normally have so many other good variables.</div>


<div><br></div><div>The same thing applies for single-winner elections. There is no need to be afraid pr ashamed to eliminate some of the candidates using some rule of thumb in order to get a manageable number of them.</div>


<div><br></div><div>So the question I would like to pose would be: Which rule should be used to reduce the number of candidates in a Kemeny election to K in the unlikely case we have a Smith set of candidates larger than K, i.e. a very even election? What should be the value of K?</div>


<div><br></div><div>Here are my favorites (all rules assume that the candidates outside the Smith set are eliminated)</div><div>1) There will be a candidate threshold for first preferences, just like in some parliamentary election, say 5%</div>


<div>2) The K candidates with the highest first preferences will be retained</div><div>3) Perform a K-seat STV election</div><div>4) eliminate the candidates according to least preferences until the method finds a winner within X minutes using a fast computer</div>


<div><br></div><div><div>Let's face it, you can crash any method, unless you have extra rules, and in the real world there are a lot of them and few object to them.</div></div><div><br></div><div>What if every citizen of the USA would decide to run for president?</div>


<div>That would be a 300 million candidate election (gasp!).</div><div>Now those ballots would be really long, which might pose a problem if you have paper ballots.</div><div>Ok, so in order to prevent that, I guess there are some rules in the US that decrease the number of candidates (number of signatures, money, party support and so on).</div>


<div><br></div><div>Best regards<br>Peter Zborník</div><div><div></div><div class="h5"><div><br></div><div><br><br><div class="gmail_quote">On Sat, Sep 24, 2011 at 4:50 PM, Kristofer Munsterhjelm <span dir="ltr"><<a href="mailto:km_elmet@lavabit.com" target="_blank">km_elmet@lavabit.com</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>Warren Smith wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

To reiterate and/or answer some questions:<br>

<br>

1. there is no way to find the Kemeny winner that is much faster than<br>

finding the full Kemeny ordering. More precisely, both are NP-complete tasks<br>

so there is no poly-time algorithm for either unless P=NP.<br>

<br>

2. Even if some benificent God helpfully informs you of the name of<br>

the Kemeny winner,<br>

or the full Kemeny ordering, then there is no easy way for you<br>

to confirm or deny that claim.   No matter how much supporting information<br>

God provides to you (if it is only a polynomially-bounded number of<br>

bits) -- such<br>

as a proof -- there remains no way to confirm or deny either claim that runs in<br>

polynomial time, unless NP=coNP.  In other words, no short proofs exist.<br>

<br>

3. It IS possible to find both the Kemeny<br>

ordering and Kemeny winner, in any election, if you are willing to<br>

devote enough compute time to it.  But the amount of time needed will<br>

exceed any polynomial in the #candidates.  Every currently known<br>

algorithm in the papers I cited fails for easy-to-generate and fairly natural<br>

40-candidate elections, no matter how much time they devote to it<br>

within the limits of their finances.<br>

<br>

4. The hardest elections have got Smith set = the full set of candidates.<br>

This is asymptotically not a great restriction since random elections have<br>

Smith set = full set, with probability -->1 in the limit as<br>

#candidates --> infinity.<br>

</blockquote>

<br></div>

Is Kemeny independent of Smith-dominated alternatives? Toby said he thought he was. If it is, then Kemeny is feasible for practical elections because you can just restrict it to the Smith set (and the Smith set won't be very large in practice). That is, of course, unless the existence of advanced voting methods will make the public vote in ways that will lead to a greater Smith set (e.g. candidates who otherwise didn't want to split the vote now feel bold enough to show up). Still, even so, I can't imagine there would be a 27-member Smith set.<div>


<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

5. I posed as a challenge, a certain randomized class of 27-candidate elections<br>

which I designed to be hard.<br>

It seemed plausible to me that these 27-candidate elections might be too hard<br>

for current algorithms to reliably provably find the Kemeny winner.<br>

I of course do not know the winners of my challenge elections, since if I did,<br>

then a short proof would exist, which it cannot in maximally hard elections.<br>

So far, nobody has convinced me they are able to answer the Kemeny challenge,<br>

but Kristofer Munsterhjelm has preliminarily claimed in email that<br>

he's been able to solve<br>

one election sampled from my class -- which causes<br>

him to be optimistic that he can answer the challenge.<br>

We'll see -- if that is true, I don't see why K.M. hasn't completed<br>

the job, since<br>

it seems like if he's done what he thinks he has, then he's<br>

done 99% of the programming work<br>

and the remaining 1% is for some reason holding him up.<br>

</blockquote>

<br></div>

The thing holding me up is called employment. As you may have noticed, I've been less active on the list as of late -- same reason: work is taking up my time and effort.<div><br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

In the event KM or somebody else successfully<br>

answers the challenge, then I'll be reasonably impressed,<br>

but then I probably can create harder challenges<br>

with greater numbers of candidates -- the goal is to<br>

try to identify the borderline between "hard" and "easy."<br>

</blockquote>

<br></div>

I think I solved a few instances of Warren's 27x27 matrix on [0...1 billion], but I can't be sure since I can't confirm that my Kemeny cost is indeed the least possible. I am, however, assuming that it is, and so I'm looking at larger sizes. Since these will be harder, I've thought of adding an initialization step where the program tells the IP solver of a reasonably low-cost guess. This could speed up the process, thus letting me solve greater size matrices in about the same time I currently solve 27x27. However, I have not actually implemented this initialization yet (see above re what's holding me up).<div>


<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

6. Richard Fobes has told me and the election methods group in emails<br>

that he's got various<br>

wonderful answers to all these issues, which he sadly has never found<br>

the time to<br>

explain, including in his privately published book on the subject.  I<br>

  (a) just don't believe him.<br>

  (b) some of his "great solutions" seem to be now to recommend a<br>

hybrid of about 3 different<br>

election methods -- which he has nowhere fully defined.   This hybrid<br>

method no longer can<br>

claim either simplicity or any attractive package of nice logical<br>

properties, but it should at least<br>

make it always feasible to compute the winner..<br>

</blockquote>

<br></div>

Kemeny would probably be best described as an optimization problem - that is, something like "the winning sequence is a transitive ordering for which the sum of voters who rank each candidate above the candidate just below it in the ordering is maximized". Then the actual process of optimization is an implementation detail, and may be as simple or complex as needed.<div>


<div></div><div><br>

<br>

----<br>

Election-Methods mailing list - see <a href="http://electorama.com/em" target="_blank">http://electorama.com/em</a> for list info<br>

</div></div></blockquote></div><br></div>

</div></div></blockquote></div><br></div>