[EM] Verification of a voting outcome for VoteFair.

Sat Apr 21 13:54:28 PDT 2012

On 04/19/2012 03:56 AM, Richard Fobes wrote:
> Finally I have time to reply to Kristofer's comments about whether or
> not VoteFair popularity ranking is the same as the Condorcet-Kemeny method.
>
> If the Smith set does not contain more than six candidates, then yes
> VoteFair popularity ranking always identifies the same top-ranked
> candidate (the "winner") as the Condorcet-Kemeny method.
>
> This occurs even if there are fifty (or more) candidates because the
> VoteFair algorithm quickly moves to the top of the ranking the
> candidates who are in the Smith set. (I'll back up this claim when I
> finally have time to reply to Jameson's even-earlier message.)
>
> The VoteFair algorithm cross-checks the top six choices using the full
> sequence-score calculation method. For this reason, plus the
> above-explained reason, there cannot be a discrepancy in identifying the
> winner for cases that involve no more than six candidates within the
> Smith set.

Okay, so you're saying VoteFair gives the same winner as Kemeny when, in 
Jameson's terms, N <= 6, but not necessarily when N > 6. So VoteFair is 
distinct from Kemeny in the strict sense.

> If the Smith set contains more than six candidates, then the open-source
> algorithm for calculating VoteFair popularity ranking results could
> possibly identify a different "winner" compared to the Condorcet-Kemeny
> method. Yet this discrepancy can only occur if the voters are not clear
> in expressing any kind of consistency in their preferences, which means
> there is a high level of circular ambiguity. (I hope to have time to say
> more about this later.)
>
> In such cases (involving a discrepancy) the "correct" Condorcet-Kemeny
> winner would be highly controversial, and that winner could easily fail
> to win a runoff election (against the VoteFair-algorithm-ranked most
> popular candidate). Plus the results would be controversial regardless
> of which Condorcet method were used.

Granted, but I was talking about Kemeny in the mathematical sense. You 
have said (or I identified you as saying, as did Warren), that VoteFair 
*was* Kemeny.

> The Wikipedia article titled "Approximation algorithm" (at
> http://en.wikipedia.org/wiki/Approximation_algorithm) provides a name
> for the relationship between the VoteFair-popularity-ranking algorithm
> and the Condorcet-Kemeny method (for cases in which the Smith set
> exceeds six candidates), and this is the same relationship that exists
> between an algorithm that quickly and reasonably (but not _always_
> _exactly_) solves the Traveling Salesman Problem and the exactly
> calculated NP-hard Traveling Salesman Problem.
>
> This matches what I've said before, but gives a specific name for the
> relationship.

Yes. However, that raises another question.

Why VoteFair at all? If it is true that N <= 6 in all practical 
elections, then the proper, exact, integer programming formulation will 
always finish quickly. So we could just as easily use "proper" Kemeny. 
Proper Kemeny would be more elegant to mathematicians - no need to 
tinker with insertion sorts and pivoting. Furthermore, were by some 
unfortunate coincidence N to be greater than 6, proper Kemeny would 
satisfy all the criteria Kemeny does not just in an approximate manner, 
but in an exact one. In that manner, exact Kemeny scales better.
Furthermore, if the voters don't care about complexity underlying a 
method, they're not going to care about the complexity of the integer 
linear programming solver that would underlie the exact Kemeny method, 
either.

Or to put it another way: say you constrain yourself to N <= 6. Then 
both exact Kemeny and VoteFair will work. Whither the advantage of VoteFair?

In logistics problems like the TSP, we have two particular properties. 
First, "almost right" is nearly as good as "exactly right". Second, 
we're dealing with very large instances so that it becomes prohibitively 
expensive to do an exact calculation. If all practical voting situations 
have N <= 6, then the second doesn't hold. If, on the other hand, N > 6 
(and the voters insist on Kemeny), then the *first* doesn't hold.

> Kristofer implies that the well-studied characteristics of the
> Condorcet-Kemeny method cannot apply to the VoteFair popularity ranking
> method if there is ever any inconsistency in the results. This
> perspective is overly simplistic. Of course it is true that if the
> Condorcet-Kemeny method always produces results that meets a specified
> criteria, then we cannot be sure that the VoteFair popularity ranking
> calculations also meet the specified criteria if the case involves more
> than six candidates in the Smith set. Yet it is also true that the
> VoteFair popularity ranking calculations may meet the specified
> criteria. We don't know which situation applies until we check the results.

Not necessarily. One might devise proofs that state that *only* Kemeny 
can pass certain criteria. To quote Wikipedia:

"In 1978 Peyton Young and Arthur Levenglick showed[3] that [the Kemeny] 
method was the unique neutral method satisfying reinforcement and the 
Condorcet criterion (...)".

Reinforcement is the major criterion where Kemeny differs from the other 
advanced Condorcet methods. If only Kemeny can pass both Condorcet and 
reinforcement, then approximations of Kemeny won't. They might fail it 
even when N <= 6, as reinforcement concerns itself with full social 
orders, not just winners.

> On 4/7/2012 3:19 AM, Kristofer Munsterhjelm wrote:
>  >...
>  > B>C>D>A has score 44.
>  > C>D>B>A has score 44.
>  >
>  > As far as I understood your post, those are the only with score 44.
>  > VoteFair picks neither, nor does it give a direct tie between C and B.
>
> If an implementation of the Condorcet-Kemeny method were to choose one
> of these two sequences, then it would be ignoring the other sequence,
> and that would ignore the whole point of the Condorcet-Kemeny method,
> which is to find the sequence or sequences with the highest score. This
> method says that both sequences are equally valid as sequences that have
> the highest score, so both must be taken into account in order to be
> consistent with the Condorcet-Kemeny method.

But I could very well claim that ignoring either sequence itself ignores 
the Condorcet-Kemeny method. Consider an analogy to Condorcet and Smith. 
The Smith set criterion is a generalized case of Condorcet in the sense 
that it says: if there's a "tie" (cycle) among certain candidates, then 
the winner should be one of the candidates in the set. Similarly, I 
could argue that in the case of a tie for Kemeny score, a method that 
passes Kemeny should pick one of the tied social orders. Otherwise 
someone could complain that the method gave a social ordering of score 
43 when there clearly is some social ordering of score 44 that it could 
have picked without harming anyone.

I mentioned ties because that might give an out. If you said that 
"B>something and C>something are tied, thus I have to pick something 
that puts both B and C at first place to do justice to both parallel 
worlds", that might work - particularly since the definition of Kemeny 
scores gives room for interpretation when one considers social orders 
with ties.

However, VoteFair doesn't give a social ordering with a tie for first. 
Instead, it picks an ordering with a score of 43. Now, VoteFair may have 
a reason for doing so, but that reason isn't Kemeny score optimization.

> Of course if these two sequences had choices B and C in the first two
> positions (such as B>C>D>A and C>B>D>A) then of course choices B and C
> are tied. But that is not the case.
>
> Instead, choice B jumps from first place in the first sequence down to
> third place in the second sequence, whereas choice C moves from second
> place to first place.
>
> The "official" Condorcet-Kemeny method (including Kemeny's original
> description) does not specify how to resolve this situation.

The integer linear program goes like this:

1. minimize (sum for all candidates i,j) w_(i,j) * x_(i,j)

subject to

2. for all distinct candidates a, b: x(a,b) + x(b,a) = 1
3. for all distinct candidates a, b, c: x(a,b) + x(b,c) + x(c,a) >= 1
4. for all distinct candidates a, b: x(a,b) is integer.

where w(i,j) is the strength of i's defeat over j according to the 
transformed (wv, margins, etc) Condorcet matrix, and x(a,b) is 1 if a is 
ranked lower than b, 0 otherwise. The constraint designated 2. 
disqualifies all solutions where some pair of candidates mutually beat 
each other, while 3. disqualifies solutions where the triangle 
inequality doesn't hold (i.e. the social ordering has cycles).

 From the ILP, we can see that it doesn't make any decision about which 
score-44 ordering it would pick. In that respect, there is a tie. 
However, the minimization strictly constrains the method to pick one of 
the score-44 orderings, something that VoteFair does not do.

The paper in which this formulation is given also defines Kemeny as follows:

"For any two candidates a and b, given a ranking r and a vote v, let 
delta_a,b(r,v) = 1 if r and v agree on the relative ranking of a and b 
(they either both rank a higher or both rank b higher) and 0 if they 
disagree. Let the /agreement/ of a ranking r with a vote v be given by 
(sum over all candidates a,b) delta_a,b(r, v), the total number of 
pairwise agreements. A /Kemeny ranking r/ maximizes the sum of 
agreements with the votes, [sum over v] [sum over a,b] [delta_a_b(r, v)]".

I.e. a Kemeny social ordering is one which maximizes the sum of 
agreements. The paper does not explicitly say that the Kemeny method 
should pick a Kemeny ordering, but that seems evident; and if it is, the 
only difference between a tied scenario and a non-tied scenario is how 
many Kemeny orderings are available for the method to choose from.

> The open-source VoteFair ranking software resolves this situation by
> calculating the average ranking of each choice over all the sequences
> that have the same highest score. In this case, if we number the
> positions starting at 1 for the beginning of the sequence, the average
> ranking for choice C is 1.5, the average ranking for choice B is 2.0,
> the average for choice D is 2.5, and the average for choice A is 4.0.
> This puts choice C (at 1.5) above choice B (at 2.0).

So VoteFair runs a virtual Borda election among all Kemeny-optimal 
orders? That is one way to do it, but it isn't Kemeny.

> Simplicity for the person who writes the software is a tiny issue
> compared to simplicity for the voters (in terms of ballot type and
> marking strategies), and compared to simplicity in terms of
> understanding the algorithm (which is a big challenge for the
> Condorcet-Schulze method), and compared to the issue of voters trusting
> the results (which relates to mathematical arguments being difficult for
> most people to understand).

Perhaps mathematical arguments are difficult for most people to 
understand, but I don't think that mechanism-centric arguments are all 
that simple, either. In VoteFair, for instance, you have the presort, 
the actual sort, and then quite a number of post-sort fixes to handle 
ambiguities. While it's not as directly impenetrable as, say, an 
eigenvector-based Condorcet method, that would still leave a lot of 
complexity.

VoteFair seems to be neither the best choice if the voters are concerned 
about mathematical elegance (in which case Schulze is simpler) or if 
they're concerned about transparency (because many things could hide in 
the ambiguity resolution part or the choice of sorting algorithms, and 
because other methods, like Ranked Pairs, are simpler). So there must be 
some other reason to prefer VoteFair.

If I recall correctly, Schulze says something like: A indirectly beats B 
by a magnitude of x if A either directly beats B by at least magnitude 
X, or A beats someone who indirectly beats B by at least magnitude X. A 
indirectly beats B (period) if A indirectly beats B by a greater 
magnitude than B indirectly beats A. The candidate who indirectly beats 
most of the other candidates wins.

Ranked Pairs is less mathematical: sort defeats in order of magnitude. 
Go down the list, making your social ordering consistent with the defeat 
unless that would contradict something you affirmed earlier.

What is VoteFair's advantage in that respect? It's not as mathematically 
simple (being Kemeny-but-not-just) as Schulze and it's not as 
mechanism-design simple as Ranked Pairs/MAM; and both of those methods 
are among the advanced Condorcet methods. (In addition, though I didn't 
mention that above, both Ranked Pairs/MAM and Schulze are cloneproof, 
but since Kemeny isn't, neither is VoteFair.)

> As for the Condorcet-Schulze method being better known, that's because
> software for it was available years ago, which relates to the concept
> that it is easy to program (except for dealing with ties, which
> complicates all methods). History is filled with examples of the
> first-available choice not surviving over time. As one example, CPM and
> MS-DOS came before MS-Windows (and that race isn't over yet).

In politics, however, precedence usually carries a very heavy weight. If 
it didn't, we wouldn't be saddled with Plurality in the first place. How 
can VoteFair counter that preference towards that which is already 
established? What "features", so to speak, does VoteFair have that 
Schulze doesn't (or Ranked Pairs doesn't)?

> I regard VoteFair ranking as having advantages that are not yet
> appreciated. Remember that the voting characteristics listed in the
> Wikipedia "Voting system" comparison chart are just checklist-like
> yes-versus-no attributes that fail to reveal how often, and under what
> circumstances, each method fails each of the failed criteria. (I am not
> disputing the importance of those criteria; I am saying that numeric
> information can be more revealing than true/false information.)

Sure. Yet one should be careful not to walk into the trap of "oh, my 
method fails X but it's okay, it doesn't count". Down that road lie long 
arguments and counters; but if one knows a method passes a criterion, 
then one knows the method always does so and that there's no wiggle room 
for it.

If you have numbers, go ahead and show them, as well as the arguments 
for their importance.

>  > On the other hand, if the max Smith set size is high, then VoteFair may
>  > not approximate Kemeny well enough. In that case, if what you want is
>  > Kemeny, then you pretty much have to go to Kemeny.
>
> If someone wants exact Condorcet-Kemeny results for a large Smith set
> then I have to wonder why. If it's needed to simulate results for
> studying its mathematical characteristics, then of course that's
> different from a group of voters wanting to know which candidate
> deserves to win an election.

And if it's a small Smith set, then you can use a constraint program 
which will cover the eventuality that the Smith set isn't small.

>  > The fine-tuning argument then is: it appears that for VoteFair to have a
>  > substantial advantage over other Condorcet methods, the max Smith set
>  > size for realistic elections have to be high enough that the other
>  > methods don't approximate Kemeny but simultaneously low enough that
>  > VoteFair does approximate Kemeny. Is that the case? It doesn't seem
>  > clear *as such*.
>
> I, and the people who use my software at VoteFair.org, have been getting
> superbly fair results, and I have not encountered any case in which one
> of the other Condorcet methods would be a better choice.

So have Andrew Myers of CIVS, to my knowledge, and he gives the results 
for not just one Condorcet method, but three and a half.

The advanced Condorcet methods are probably all "good enough". Thus 
different concerns come into play, like precedence or simplicity.

> Personally my view is that there is excessive focus on single-winner
> voting methods (partly because they are easier to study), yet there are
> bigger frontiers waiting to be pioneered. That's why I've gone beyond
> VoteFair popularity ranking to develop VoteFair representation ranking
> (because the "second-most popular" choice is not the same as the
> "second-most representative" choice), and VoteFair partial-proportional
> ranking (which provides a PR method that is designed for the situation
> in the U.S.), and VoteFair party ranking (which deals with the problem
> that will arise when vote splitting is not around to limit the number of
> candidates on the ballot, and which happens to minimize the cloneproof
> failure ascribed to the Condorcet-Kemeny method), and VoteFair
> negotiation ranking (which allows a Parliament to elect a fair slate of
> Cabinet Ministers without using any quota rules to enforce fairness).

I tend to agree. I have made some multiwinner methods myself, and most 
of the parliamentary European countries have little need for 
single-winner methods - except possibly on local mayoral levels. I would 
say, though, that I suspect the kind of optimization that you need to do 
in order to get really good PR according to voters' preferences implies 
the method can't be polytime in the worst case.

(Perhaps you'd be interested in my proportionality/majoritarian regret 
tradeoff picture and data: 
http://munsterhjelm.no/km/elections/multiwinner_tradeoffs/ )