[EM] pairwise, fairness, and information content
Richard Moore
rmoore4 at cox.net
Wed Aug 14 19:12:45 PDT 2002
A couple of weeks ago, Craig Carey made an assertion that (to me, at
least) was astonishing. He said that a pairwise comparison of
candidates contains no information about the "right winner" (his
phrase) in an election. That statement should be examined closely.
If Craig's statement were true, then, in a two-candidate election,
pairwise comparing would give the wrong result as often as it gives
the right result. That would have far-reaching consequences in the
study of election methods.
The meaning of "right winner" is subjective, but I'll postulate that a
method capable of picking the "right winner" (if such a method can be
defined) must meet some basic requirements of fairness.
In my earlier response to Craig, I used the words "non-arbitrary" and
"non-dictatorial" to express, loosely, the concept of fairness, but
that was misleading on two counts. First, "non-arbitrary" might be
subject to different interpretations, just as "right winner" is.
Second, "non-dictatorial" doesn't change the truth value of my
statement one way or another, since the pairwise result and the result
of a dictatorship will have a weak positive correlation, if the result
is dictated by one of the ballots used to obtain the pairwise result.
(If the dictatorial result is not determined by one of the ballots --
if it is imposed from without -- then it is "arbitrary"). So I need
some fairness principles to replace the imprecise term,
"non-arbitrary". I submit the following three requirements for fairness:
1. Monotonicity -- Replacing a ballot that ranks X over Y with one
that ranks Y over X shall never cause the result to change from a win
for Y to a tie or a loss for Y, or from a tie to a loss for Y.
2. Pareto -- If everyone favors candidate X over candidate Y, then
candidate X shall win. A monotonic method that did not meet this
requirement would never elect candidate X.
3. Permuting the ballots shall not change the result. This eliminates
methods that arbitrarily weight ballots differently. With this
restriction, it is not necessary to represent the ballots as an
ordered set; it is sufficient to know the total number of ballots for
each classification.
I would also introduce a 4th requirement that a fair method be
unbiased, but it turns out not to be necessary for the proof that follows.
I will also postulate that "method 1 has information about method 2"
means that there is a non-zero correlation between the results of
method 1 and the results of method 2. For a two-candidate,
single-winner election, that means, if S is a multiset of ballots, and
if we define f1(S) to be the following:
f1(S) = +1 if method 1 selects A given S as its input,
0 if method 1 does not make a decision,
-1 if method 1 selects B given S as its input;
and if we define f2(S) in a similar way for method 2, then for any
positive integer C,
sum( f1(S)*f2(S) ) != 0
when the summation is done over all sets containing at least one, but
not more than C, ballots. The upper bound C makes consideration of
infinite sums unnecessary.
I am not defining a direct measure of the quantity or quality of
information found in method 1 about method 2; I am only defining a
test for the existence of such information.
There are three types of ballots: A>B, neutral, and B>A. Let ca(S) be
the number of A>B ballots in S, and cb(S) be the number of B>A ballots
in S. Let P(S) be the standard pairwise comparison that maps any such
set to a selection from the 2-candidate set { A, B }, and let p(S) be
defined as follows:
p(S) = -1 if ca(S) < cb(S) (i.e., if P(S) = B)
p(S) = 0 if ca(S) = cb(S) (i.e., if P(S) is indeterminate so that a tie-breaker
is needed)
p(S) = +1 if ca(S) > cb(S) (i.e., if P(S) = A)
Now let M be a method that selects the "right winner" from the
two-candidate set { A, B }, given the multiset of ballots S; hence by
my original postulate M meets the three requirements introduced
earlier. The funtion m(S) can then be defined as:
m(S) = -1 if M(S) = B
m(S) = +1 if M(S) = A
m(S) = 0 if M(S) does not provide a result
We need to determine if p(S) and m(S) are correlated.
Consider two complementary sets of ballots, S and S'. By
complementary, I mean that S' can be obtained by replacing every A>B
ballot in S with a B>A ballot and replacing every B>A ballot in S with
an A>B ballot. Then ca(S) = cb(S'), cb(S) = ca(S'), and p(S) = -p(S').
Each set S for which p(S) != 0 is a member of a unique complementary pair.
There are four cases:
1) m(S) = p(S) and m(S') = p(S')
2) m(S) = p(S) and m(S') != p(S')
3) m(S) != p(S) and m(S') = p(S')
4) m(S) != p(S) and m(S') != p(S')
If p(S) = 0, then p(S') = 0, and none of these combinations can
contribute to the correlation sum. So we only need to consider the
(S, S') pairs where p(S) != 0. For such pairs:
The first case contributes 0, +1, or +2 to the correlation sum,
depending on whether neither, one, or both of m(S) and m(S') are equal
to zero.
The second case does not affect the correlation sum, unless m(S') = 0,
in which case it contributes +1.
The third case does not affect the correlation sum, unless m(S) = 0,
in which case it contributes +1.
The fourth combination would contribute 0, -1, or -2 to the
correlation sum, depending on whether neither, one, or both of m(S)
and m(S') are zero. However, the fairness requirements only allow the
fourth case to occur when m(S) and m(S') are both zero, as shown below:
For any set S for which p(S) != 0, and its complement S', we have either
a) cb(S') = ca(S) > ca(S') = cb(S), therefore p(S) = +1
or
b) ca(S') = cb(S) > cb(S') = ca(S), therefore p(S) = -1
Due to the monotonicity of method M, we can assert for case (a) that
if M(S) = B or is indeterminate, then M(S') = B or is indeterminate,
since S' can be obtained from S by replacing some A>B ballots with B>A
ballots. Thus, if p(S) = +1, and m(S) != p(S) (implying that M(S) !=
A), and m(S') != p(S') (implying the M(S') != B), we can only meet the
given conditions if M(S) and M(S') are both indeterminate (i.e., if
m(S) = m(S') = 0).
A similar argument applies to case (b).
Therefore, each complementary pair of ballot sets must have a
non-negative contribution to the correlation sum.
The Pareto requirement means that, for set SCA which contains only C
A>B ballots and set SCB which contains only C B>A ballots:
M(SCA) = A
M(SCB) = B
Note that m(SCA)*p(SCA) = +1 and m(SCB)*p(SCB) = +1. Thus the
complementary pair of sets SCA and SCB will contribute +2 to the
correlation sum. Since no other complementary pair can contribute a
negative amount to the sum, the sum must be positive for all sets of C
ballots. Furthermore, if we add the sums for all sets with fewer than
C ballots, the sum will still be positive.
Hence, if M is a fair method for selecting one winner from two
candidates (i.e., for ranking the two candidates), then the standard
pairwise method P provides some information about the result
of M. It seems rather obvious, even remedial, but sometimes a proof is
called for.
This theorem says nothing about the quality of that information
(I'll have comments -- of a qualitative nature -- on pairwise
information quality in the future). It seems doubtful that methods
that correlate only weakly with P for two candidates could have much
practical value. My own belief is that the best method for picking a
single winner from two candidates (the question Craig did not want to
answer) is P plus a random tie-breaker.
-- Richard
----
For more information about this list (subscribe, unsubscribe, FAQ, etc),
please see http://www.eskimo.com/~robla/em
More information about the Election-Methods
mailing list