[EM] Kemeny update

Sat Sep 24 14:12:34 PDT 2011

Peter Zbornik wrote:
> Dear all,
> 
> I guess there could be some simple elimination of candidates before the 
> election, so that there will be a manageable set of candidates for the 
> Kemeny election, like 10 to 15.
> 
> I guess sometime the winner would be lost in the reduction, but I would 
> expect this to be extremely rare if just the elimination would be efficient.
> 
> Variable elimination is standard in multivariate data analysis with one 
> response variable (the thing you like to predict) to reduce the number 
> of variables, which initially can be several hundreds to something more 
> manageable, like 40 variables. A common way to reduce the number of 
> variables is just to do a single variable analysis with the response 
> variable. The variables with a significance below a threshold are 
> eliminated. Yes, it might happen, that a good variable is eliminated, 
> which significantly improve the analysis, but this is very likely, since 
> you normally have so many other good variables.
> 
> The same thing applies for single-winner elections. There is no need to 
> be afraid pr ashamed to eliminate some of the candidates using some rule 
> of thumb in order to get a manageable number of them.
> 
> So the question I would like to pose would be: Which rule should be used 
> to reduce the number of candidates in a Kemeny election to K in the 
> unlikely case we have a Smith set of candidates larger than K, i.e. a 
> very even election? What should be the value of K?

Kemeny's strength is its relative simplicity compared with passing many 
properties (though unfortunately not clone independence). Thus I think 
that any patch to it should be simple as well, or one might as well just 
use Schulze or Tideman.

(Some have claimed that Kemeny's NP-hardness is a strength: although it 
means that some elections can take a long time to calculate, it also 
makes it similarly hard to calculate a strategy. I don't think this is 
really as strong a property as those claim, because strategies don't 
have to be exact. I do know Google does (or did) use "local 
Kemenization" - sorting - to make it harder to game their index, though.)

So, if simplicity is valued, I suggest something like this:

- Fix n = 27 (or the number of candidates you're going to permit).
- Calculate the Copeland ordering (which may have a lot of ties).
- Retain the top n candidates and pass them to Kemeny.
- The winner of this second stage is the winner of the entire election.

Copeland is a quite simple rule in itself, it is Smith, and since it's 
only used to remove candidates, its propensity towards ties shouldn't be 
a problem.