[EM] VoteFair Ranking software version 6.0 in C++ with MIT license

Fri Jan 10 03:30:47 PST 2020

On 06/01/2020 06.19, VoteFair wrote:
> On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote:
>> Do you agree with my point about the use of the term "mathematically
>> equivalent"? If the method appears to be equivalent, that might give the
>> impression that you don't need to care about cycles at all.
> 
> VoteFair popularity ranking IS mathematically equivalent to the
> Condorcet-Kemeny method. The difference is that John Kemeny described
> the version that counts disapprovals (and finds the smallest sequence
> score), whereas VoteFair popularity ranking counts approvals (and finds
> the largest sequence score).
> 
> The VoteFair Ranking software does calculate the full version (of
> VoteFair popularity ranking) to the extent that the
> "global_check_all_scores_choice_limit" value is as large as desired.
> 
> I choose to leave that value at 6, even though it could be set to 20 or
> higher with no execution errors.  Of course in that case it would be
> helpful to insert code that checks for time delays and shows progress in
> case a Condorcet cycle is slowing it down.

What you seem to be saying is that the objective function (what you're
trying to optimize), when taking into account the direction of the
optimization, is the same, so the problem is mathematically equivalent.

That's right. But that doesn't suffice to make the *solution function*
mathematically equivalent. If it did, then likewise any approximation to
e.g. Traveling Salesman would be mathematically equivalent with solving
Traveling Salesman itself, just because it counted the total distance
the salesman has to travel the same way.

VoteFair may *approximate* Kemeny, and it may use an equivalent scoring
function, but for any value of the constant, there exist an infinite
number of inputs for which the outputs will disagree.

Note that this isn't about "likely to disagree". Mathematically, if
there's a single difference between two functions' mapping, then the
functions are different. This holds regardless of the implementation
(because a mathematical mapping is agnostic to how it's implemented).
E.g. both Insertion and Merge sort implement the mapping from unsorted
lists to stably sorted ones, but the algorithms themselves are very
different.

> I'm not sure what you mean by:
>> ... that might give the
>> impression that you don't need to care about cycles at all.
> 
> Neither the method nor the software implies that cycles are of no
> importance.

Someone who reads "mathematically equivalent" as "proven to give the
same results" will think that there are no inputs for which the outputs
differ. Thus he may think that there's no need to be on the lookout for
strange behavior under certain conditions.

I seem to recall that around 2010, Warren thought that you were claiming
that VoteFair did exactly reproduce Kemeny results/optima (for every
election, i.e. not an approximation). There was then a long thread about
NP-hardness and whether or not determining the Kemeny winner is NP-hard
(which it is). I *think* you reached the conclusion that VoteFair was an
approximation, but I don't clearly remember.

The point is, Warren's a mathematician and got confused about what
"mathematically equivalent" meant as you used it. I'd imagine a
non-mathematician would have even less chance of getting it right.

And if he understands "mathematically equivalent" that way, and think
VoteFair always returns the Kemeny optimum, then he's not going to
think that "oops, I have to be careful about *these* elections", because
there's nothing to be careful about. After all, it's equivalent, so
shouldn't it return the correct outcome for every election? VoteFair
itself doesn't give any indication of when the results are suspect and
should be thrown out for a tie instead, either.

(I'd suggest replacing the insertion sort code with an integer
programming solver (like Math::LP::Solve or Math::GLPK) and setting a
timeout. Such a solver either returns the optimum or says "couldn't do
that in the time given". It's equivalent over the domain where it
returns a result, and outside of that domain, it clearly times out.

Or more generally, alter the code so that instead of returning a
possibly-right result always in finite time, always returning a right
result but sometimes taking a very long time about it. A timeout is
easier to diagnose than a result that looks right but isn't. But if
you're going to do that, why not use an integer programming solver?)