[EM] Fw: IBCM, Tideman, Schulze - Reply Part 1

Thu Jul 6 07:47:42 PDT 2000

I've been having trouble getting this message to post to the EM list.  A
short test message I sent a couple of days ago did seem to make it through,
so I thought there might be a message length limit that was causing my
posting to be dropped.  Therefore, I'm sending this again in TWO parts, in
the hope that this resolves the problem.

Norm Petry

***********

Hi--

I was looking at the last couple of exchanges between Mike and Markus and
noticed that Mike was asking questions which I thought I'd answered a few
days ago in a message to the EM list on 25JN00.  Puzzled, I decided to
double-check the e-groups archive and discovered that my message appears to
have gone astray; perhaps it was rejected because my ISP's domain name has
changed recently.  Anyway, I resubscribed to the list and am sending it
again.  Hopefully it will answer a few outstanding questions.

Norm

p.s. There is at least one obvious error in this message (probably many
:-) -- I wrote that SD and Schulze were the same without any pairties, but
obviously this is false, as Markus just showed.  I should in fact have
remembered that this was the case, since Mike gave a similar, 8-candidate
example in '98 which showed that SD was not completely clone independent,
and incidentally could produce a different result than Schulze without any
ties.  Rather than edit, I'll just send it as is...

***********

Steve,

I apologise for the long delay in replying to your very substantial posts
from a few weeks ago (I've been very busy with weddings and such, and
haven't had time to devote much attention to issues on EM).  I've
interspersed my reply with your comments, below:

N.

>-----Original Message-----
>From: Steve Eppley <SEppley at alumni.caltech.edu>
>To: election-methods-list at eskimo.com <election-methods-list at eskimo.com>
>Date: June 2, 2000 6:14 PM
>Subject: Re: [EM] Fw: IBCM, Tideman, Schulze
>
>
>Norm P wrote, in reply to a private question from Markus S:

[...]

>> simulations I've run which compare these methods.  However, it
>> is an open question whether it is possible to create this or
>> other similar examples using actual ballots.
>
>[Sorry about the delay in my response to Norm's message.]
>
>It's not really an open question.

Not anymore :)  What I meant was that I hadn't confirmed that the example I
gave Markus could have been generated from a set of ballot papers.  My
'rankPairwiseMethods' simulation starts with randomly generated 'Smith
Matrices' rather than ballots, and some of these will be 'impossible', in
that sense.  Thanks for providing an example based on ballots.

[...]

>> To test this claim I wrote my own simulation (called
>> 'rankPairwiseMethods' in VoteSim) which generates a series of
>> 'Smith Matrices' of various sizes, and compares how often
>> each method's winner pairwise-beats the winner chosen by every
>> other method.  Since I wanted to apply this test to a large
>> number of methods, I began by creating a pairwise matrix for
>> the methods themselves.  For each trial, I computed each
>> method's winner; if a method's winner beat another method's
>> winner pairwise, it would get one 'vote' against the competing
>> method.
>
>I'm curious about what Norm's software does when one or more of
>the methods are not decisive.  The stat that I implemented does
>not count scenarios in which either of the two methods being
>compared is indecisive.
>

In my 'rankPairwiseMethods' simulation, *all* the compared methods
decisively choose a single winner.  A tiebreaker is used to pick the winner
in the event that the core of the method is indecisive.  This is done by
creating what Tideman calls a 'TBRC' (tie-breaking ranking of candidates)
that is supplied to every single-winner routine.  A TBRC unambiguously
orders all of the candidates based on some other method.  Normally, Random
Ballot//Random is used to construct the TBRC (although any method can be
used -- a TBRC for Anderson's 'Regular Champion' would be created from
Plurality//Random to tiebreak Copeland, for example).

Because this particular test uses 'Smith Matrices' rather than ballots for
its raw data, I could not use a Random Ballot TBRC.  Therefore, *all* of the
methods in this particular test (including Copeland) used a simple Random
TBRC.  Note however, that the *same* TBRC was used for all the methods being
compared for each trial, so that, for example if the core parts of two
methods both pick the same *set* of winners, they would also yield the same
*single* winner after the tiebreaker is applied.  You may wish to look at
the 'rankPairwiseMethods' routine included with the version of VoteSim I
distributed to list members a while ago for details.

[...]

>
>> Once the simulation was complete, I determined a social
>> ranking of all the methods using Tideman's method (somewhat
>> arbitrary, but the result would be similar using any other
>> approach).  My results for 3-10 candidate Smith sets are as
>> follows (just for laughs I included Borda!):
>
>Since Borda and the PC(..) methods look outside the Smith set,
>perhaps those shouldn't be tested using a "Smith matrix."
>

Yes, I was aware of this.  Some of the 'av' (all-votes) variants of these
methods will also sometimes select winners outside the Smith set, so
strictly speaking, this test is assessing the performance of Smith//Borda,
Smith//PC, etc.

>> Trials:     20,000
>> Voters:     100
>>
>> Rank\Cands: 3             4             5             6
>>      1      Schulze(m)    IBCM(av)      Copeland      Copeland
>>      2      IBCM(m)       IBCM(wv)      IBCM(av)      IBCM(av)
>>      3      Schulze(wv)   IBCM(m)       IBCM(wv)      IBCM(m)
>>      4      IBCM(wv)      Tideman(wv)   IBCM(m)       IBCM(wv)
>>      5      IBCM(av)      Tideman(av)   Tideman(m)    Tideman(m)
>>      6      Schulze(av)   Tideman(m)    Tideman(wv)   Tideman(wv)
>>      7      PC(m)         Copeland      Tideman(av)   Tideman(av)
>>      8      Tideman(m)    Schulze(av)   Schulze(av)   Schulze(m)
>>      9      Goldfish(m)   Schulze(wv)   Schulze(wv)   Schulze(wv)
>>      10     SD(m)         Schulze(m)    Schulze(m)    Schulze(av)
>>      11     SSD(m)        SSD(wv)       SD(wv)        SD(m)
>>      12     PC(av)        SSD(av)       SSD(wv)       SSD(m)
>>      13     Tideman(wv)   SD(av)        SSD(av)       Goldfish(m)
>>      14     PC(wv)        Goldfish(wv)  Goldfish(wv)  SD(wv)
>>      15     Goldfish(wv)  Goldfish(av)  SD(av)        SSD(wv)
>>      16     SD(wv)        SD(wv)        Goldfish(av)  SSD(av)
>>      17     SSD(wv)       SD(m)         SSD(m)        Goldfish(wv)
>>      18     SSD(av)       Goldfish(m)   SD(m)         Goldfish(av)
>>      19     Tideman(av)   SSD(m)        Goldfish(m)   SD(av)
>>      20     Goldfish(av)  PC(av)        PC(m)         PC(m)
>>      21     SD(av)        PC(wv)        PC(av)        PC(wv)
>>      22     Copeland      PC(m)         PC(wv)        PC(av)
>>      23     Borda         Borda         Borda         Borda
>>
>> Rank\Cands: 7             8             9             10
>-snip-
>> Comparing Schulze(wv), Tideman(wv) and IBCM(wv), we see that
>> Steve's conclusion is correct for Smith sets of 4-10
>> candidates.  Note though that this result does not hold for
>> 3-candidate Smith sets, where Schulze produces better results
>> than either IBCM or Tideman.
>
>I think Norm has erred significantly in the 3-candidate case.
>As Markus noted last year, given 3 candidates, Schulze and
>PC(wv) choose the same.  Yet we can see that Norm's 3-candidate
>column ranks PC(wv), MTM, and IBCM below Schulze.
>

A number of the methods in the 3-candidate case are probably tied, but tied
methods are simply shown sequentially in my results, which is admittedly
somewhat misleading (see below for further explanation).

>And it is easy to show that, given 3 candidates, PC(wv) and BCM
>choose the same, and that the IBCM winner never loses to the
>PC(wv) winner, or to the Schulze winner, or to the MTM winner.
>Furthermore, given 3 candidates, PC(wv) and MTM choose the same.
>
>So Norm's implementation must be flawed.  Schulze, MTM, and
>PC(wv) must tie in the 3-candidate stat.  IBCM either ties them
>or leads them in the 3-candidate stat, depending on whether
>indecisive scenarios are counted or neglected.
>
>The way I've implemented the head-to-head comparison test, it
>only counts scenarios where the two methods being compared are
>decisive, needing no tie-breaker.  So in the 3-candidate
>comparison, the stat when comparing any two of {IBCM, MTM,
>Schulze, PC(wv)} is 0-to-0.  If I also counted indecisive
>scenarios (which would be rare in large public elections but not
>in committees) I believe IBCM would show an edge over Schulze in
>the 3-candidate case, due to scenarios where Schulze chooses two
>and IBCM chooses the pairwinner of Schulze's two. (For example,
>the scenario "BC52,CA51,AB51" where Schulze chooses A&B and IBCM
>chooses A.  When the Schulze tie is broken, it may choose B,
>which would count fractionally for IBCM head-to-head.)
>

One problem with your assessment of the relative indecisiveness of Schulze
vs. IBCM is that you are mistakenly assuming that the Schulze winner(s)
simply consist of the initial set of beatpath-undefeated candidates.  This
is not the case, however.  The complete Schulze method was defined by Markus
when he introduced it to EM on 4 Oct 97:

"Step1: Calculate the Smith set and eliminate all those
candidates, who are not in the Smith set.

Step2: Calculate the beat-path-defeats of the remaining
candidates. The winner is that candidate, who defeats every
other candidate via a beat path.
If there is no unique candidate, who defeats every other
candidate via a beat path, then the Schwartz set of
the beat-path-defeats is calculated, those candidates,
who are not in this Schwartz set, are eliminated and
the algorithm is re-started with the remaining candidates."

Therefore, the beatpath calculation needs to be repeated in the event of
ties.  When this is done with your "BC52,CA51,AB51" example, the Schulze
winner is decisively A, not {A,B}.

>So I stand by my conclusions: IBCM & MTM dominate Schulze on the
>head-to-head comparison.  Norm's result is probably based on
>different implementations of the voting methods than I use.  He
>may also count in his stats some differences when methods are
>indecisive, and if he does that then it remains to be seen
>whether he is doing it in a reasonable way.
>
>I hope that if Norm runs his test again (after presumably
>correcting any bugs), he'll post the pairwise table generated by
>his software (or at least the portion for the best methods,
>since the table is large), not just a table of rankings derived
>from it.  That would reveal the magnitude of the head-to-head
>edge, and whether or not the tested methods cycle.
>

I've double-checked the simulation, and you are partially correct.  There
are no bugs, per-se, but due to the limitations of providing a strict
ranking of methods, rather than the pairwise matrix (which is very large),
even tied methods will necessarily appear in a particular order.  It just so
happens that the Schulze methods appear earlier in the unsorted result than
the IBCM methods; since the two methods are exactly tied, the Schulze
methods appear first in the *ranked* output.  So yes, you are correct --
Schulze(wv) and IBCM(wv) always produce the same winner in the
three-candidate case (there are certainly other tied methods in the
3-candidate ranking as well). This is *not* true when comparing Schulze(wv)
and Tideman(wv), however. Schulze(wv) produces better results in cases like
this:

A>C 7:4
B>A 7:4
C>B 8:3

Here, Tideman yields:

Lock CB 8 (C>B)
Lock BA 7 (C>B>A)
- or -
Lock AC 7 (A>C>B)

Therefore the winner is A or C (50% probability of either)

For Schulze, the result is:

A>>B 7:7
A>>C 7:7
C>>B 8:7

Beatpath Undefeated candidates are [A,C], so repeating the beatpath
calculation with these undefeated candidates we get:

A>>C 7:4

So the Schulze winner is A (decisively).

Since the Schulze winner will beat the Tideman winner pairwise in 50% of
these cases, and tie the Tideman winner in the remaining 50%, Schulze will
always outperform Tideman  in the 3-candidate case if there are any ties.
For example, in a 5000 trial sample (3 candidates, 25 voters) I observed:

Schulze(wv) > Tideman(wv) 177:5

Note that Tideman *will* rarely beat Schulze by accident, if both methods
are indecisive.  In that case, the random tiebreaker may choose a different
winner for each method, and Tideman's winner could happen to have a pairwin
against the Schulze winner.  This explains the 5 cases noted above.

>We can infer from Norm's message that he does not agree with
>Markus' objection that simulations based on random voting are
>meaningless.  Markus said that the use of random voting in
>simulations implies that my underlying model of voter behavior
>is that voters behave randomly, but that's not a logical
>conclusion.  People have at various times posted estimates of
>the percentage of scenarios which would have a Condorcet winner,
>using simulations involving randomness, but Markus has not
>criticized those estimates.  I think that the main argument why
>random generation is useful even though voters do not vote
>randomly is that, given a voting method which encourages
>competition (i.e., independence from "similar" alternatives)
>there would presumably be a number of similar viable candidates
>and a number of irrelevant candidates.  Preferences regarding
>similar candidates figure to be more random than votes regarding
>irrelevant candidates.  And there is no need to simulate
>irrelevant candidates.
>

Actually, I would say that simulations of voter behavior based on random
ballots *are* 'meaningless', but this does not mean that random datasets are
*useless*.  Both my 'rankPairwiseMethods' simulation and yours are based on
purely random data.  In your case, random ballots are used, whereas I am
generating pairwise matrices randomly.  By using random ballots, you avoid
the problem of 'impossible' pairwise matrices (that is, a matrix that cannot
be generated from *any* set of ballots), at the cost of having a slower
simulation.  In both cases, these random data sets are useful for testing
criteria compliance and other theories.

I agree with Markus that we cannot meaningfully simulate an electorate using
random data.  In a purely random simulation, the expected value for every
cell in the pairwise matrix will contain v/2 votes, where v is the number of
ballot papers.  A matrix like this contains no information, so trying
to find the Condorcet winner from such a matrix is analogous to trying to
extract a signal from white noise.  Of course, there will always be some
small deviation in each cell from the expected value, and this may be
sufficient information to allow a (uniformly random) Condorcet winner to be
chosen.

On 5 Mar 2000 Bart Ingles wrote:

>Adding a 'random society' component to the model would increase the
>probability of cycles.  It looks like a purely random model reaches 50%
>at around 10 or 11 candidates (it would still take many more to reach
>the figure shown above, though).

Bart didn't give a source, but I think I heard somewhere that this statistic
was from some of Duncan Black's writings.  It occurs to me that the only use
for a statistic like this is that it might possibly establish an absolute
lower-bound on the probability of having a Condorcet winner (that is, for a
specified number of voters and candidates).  Someone with a better grasp of
information theory than I possess would need to check this for sure, but it
seems to me that the probability of having a Condorcet winner is positively
correlated with the amount of order in the pairwise matrix.  If this is
true, then any simulation (or real life, for that matter) that is not purely
random would have a greater likelihood of having a Condorcet winner than the
purely random case, since the pairwise matrix would necessarily be more
ordered.  This type of data provides a good example of how meaningless data
can produce useful results -- provided one does not use the information
incorrectly.  If, for example, Duncan Black was trying to estimate how
serious a problem voting cycles would be in *real* elections, then his model
is so flawed as to make the result quite meaningless.  One needs to use
spatial models or some other technique which assumes some degree of rational
voter behavior in order to make reasonable predictions about the probability
of voting cycles in the real world.

>> Therefore, in order to judge which method produces better
>> results by this metric, it is necessary to know the relative
>> frequency of Smith sets of various sizes that we might expect
>> in an actual election.
>
>Not so, assuming Norm's 3-candidate column is flawed.
>

However, since my 3-candidate results for Schulze and Tideman are *not*
flawed, this frequency analysis is necessary.  It is possible that Tideman
may have an _overall_ advantage over Schulze by this pairwinner comparison
metric, but it is also clear that Schulze will do better in the 3-candidate
case, so it cannot be said that Tideman is *always* better than Schulze in
this regard.  You are right that IBCM is always as good or better than
Schulze, but since that method is no longer under consideration due to
monotonicity violations, it is really only the Schulze & Tideman results
that are of any interest (I guess this depends on one's personal priorities,
but I wouldn't trade monotonicity for better results here, and neither would
you apparently).

On the question of whether Tideman has an overall advantage over Schulze
(due to better performance when Smith Set size is greater than 3), the
simplified frequency analysis I provided for 10-candidate Smith sets implied
an overall advantage for Tideman, but this result is flawed in at least two
ways:

1) In the real world, elections having few (3-5) candidates are much more
likely than elections having many candidates.  I have no way of estimating
these probabilities, but it is obvious that with fewer candidates, Smith set
sizes are limited to the number of candidates running, so the 3-candidate
result is more significant than my 10-candidate results would imply.  For
this reason, in _actual_ elections, we might find that Schulze performs
better than Tideman.

2) Although Schulze outperforms Tideman in the 3-candidate case, the
probability of the two methods yielding different results when there are 3
candidates is very low (it requires a particular kind of pairtie).  As the
number of candidates increases, the probability of the two methods yielding
different winners increases, and it is of course only when the methods
produce different results that this metric is of any importance.  Therefore,
I may also be *overstating* the importance of the 3-candidate case, in which
case the advantage does belong to Tideman after all.

Therefore, at this point, the relative merit of Schulze vs. Tideman
according to the pairwinner comparison cannot be conclusively determined.  A
more refined analysis, together with a large body of empirical data from
actual elections would be needed to decide this matter with certainty.

>-snip-
>> That said, I am not persuaded that this is a particularly
>> important consideration -- if we thought this was an important
>> factor in selecting a method, we would probably choose
>> Smith//Copeland, since its results are even superior to those
>> of IBCM.
>
>Smith//Copeland is the same as Copeland.
>

I'd forgotten that Copeland satisfied the Smith Criterion.  Thanks for
pointing that out (it's not a method we look at very often!).

>I haven't tested Copeland, so I can't verify Norm's surprising
>Copeland result.  Maybe I will find time to add the Copeland
>method to my software's collection.
>
>Since Copeland is so indecisive, and presuming that Copeland
>advocates will specify a reasonably deterministic tie-breaker
>(e.g., Copeland//Plurality, advocated by Bruce Anderson), a
>reasonable comparison requires evaluating Copeland with such a
>tie-breaker.  Norm hasn't specified what his software does when
>Copeland, or any other method, is indecisive.
>

As I mentioned, the Copeland method, along with all the other methods in
this particular simulation simply used a random tiebreaker, since ballots
were unavailable.

>Also, since Copeland neglects the vote counts in each pairing,
>I'd expect it wouldn't do nearly so well in my other head-to-
>head stat which accumulates margins.
>

If by margins you mean the margin by which one method beats another (that
is, how decisive a pairwin one method gets against another method), then I
think you'd be surprised.  I don't think it is at all unlikely that Copeland
dominates all the other methods by this particular metric, since the number
of pairwins a given candidate has against its opponents is probably the best
predictor of whether it will have won any _particular_ contest.

Nonetheless, please add Copeland to your own simulation, if you have time,
and let me know if you get different results.

>However, it's important to clear up a major misunderstanding
>apparently shared by Norm, Markus, and Blake due to my terseness
>about the purpose of the head-to-head comparison:
>
>The head-to-head comparison criterion is one among many
>criteria.  Criteria are not equally important.  I'm sure we
>agree that several criteria are more important than the head-to-
>head comparison: feasibility, anonymity, neutrality, simplicity,
>monotonicity, resolvability, top cycle, independence from
>clones, pareto, etc.
>
>Each reader must judge the relative importance of criteria.  The
>head-to-head comparison can serve to distinguish between methods
>which are not distinguished by any of the criteria the reader
>considers more important than the head-to-head comparison.
>
>Some of us, including me, also consider Mike Ossipoff's
>defensive strategy criteria (Minimal Defense, Truncation
>Resistance, etc.) more important than the head-to-head
>comparison, and more important than some of the other criteria
>listed above.
>

I agree completely with all of the above.  I was merely stating my opinion
on the importance of the 'head-to-head comparison', so that you and others
would understand why it doesn't persuade me to prefer Tideman or IBCM to the
Schulze method.  My comment about Copeland's performance was to make the
same point that you have eloquently stated above -- that we cannot look at
high performance on a particular criteria in isolation, but instead consider
it within the context of many other issues in order to make a proper
assessment (I don't think any of us want to have to advocate Copeland ;-)

>I don't consider Beatpath GMC at all important:
>
>1. I see no justification for disqualifying a candidate who
>would not be disqualified if a few indifferent abstainers decide
>to express their indifference by voting.
>

I agree with you that beatpath GMC may not be as important a criterion as I
once thought.  Mike proposed regular GMC, and then Markus proposed beatpath
GMC in order to provide some useful guarantees and protections for voters in
the face of certain types of offensive strategy.  Since he now claims that
BC is sufficient to make the method sufficiently strategy-proof, I'm willing
to take his word for it -- I consider Mike to be our expert on issues
related to strategy (it's not an issue I pay much attention to).  That said,
it still seems to me that it is better if a method can satisfy GMC unless
some other useful criteria need to be sacrificed in order to get it (it
remains to be seen if there are any...)

>2. I see no reason why we should accord more respect to a
>"majority strength beatpath" abstraction than to the real
>plurality of voters who prefer a candidate rejected by Beatpath
>GMC more than the Schulze winner.  Consider this example:
>
>   Majorities:  BD65,DC64,CB63,AF62,FE61,CF52,EA51,AB49
>   (All other pairings are ties.)
>   MTM & IBCM elect A.  Schulze elects B.
>
>   There is a "majority strength beatpath" from B to A:
>   BDCFEA has strength 51.
>

In this example, you do not provide the total number of voters, or the
ballots used to generate the pairwise matrix.  Assuming there are 100 voters
and this is a 'possible' pairwise matrix, then candidate A cannot be elected
by any method satisfying beatpath GMC.  If however there are fewer than 98
voters, then all of the candidates have a majority beatpath against them, so
any beatpath-GMC compliant method could elect either A or B.

Most methods (PC, SD, Tideman, IBCM, etc.) elect A in this case.  Schulze
and SSD choose candidate B.  This may be necessary to satisfy beatpath GMC,
or it might not -- without the ballots it cannot be determined with
certainty.  Although I agree with you that candidate A is probably the
better choice in this case, the election of candidate B is not indefensible:

Using the logic of SSD, we note that all candidates are members of the
Schwartz set.  Eliminating the weakest defeat (49), we are left with two
cyclic groups of candidates (AEF, BDC).  Since AEF is beaten (52), and BDC
is not, the winner must be chosen from BDC.  Since candidate B is
least-beaten (63) it is the obvious choice from among the 3.

[continued in next message...]