[EM] Re: thoughts on weighted pairwise

Sat Sep 4 04:04:41 PDT 2004

Jobst,
	This is really great. I'm really glad to get such serious feedback from
you; it's really important to me at this stage.
>
>Well, I don't believe in cardinal utilities, so that will presumably be
>my largest problem with the proposal, but let's see... I admit some
>preferences might be considered more important than others, but it seems
>to me that this is again only a qualitative judgement instead of a
>quantitative one. How could one define the quantitative differences in
>an operational way? But perhaps it would suffice to assume that there
>exists a quasi-order (=reflexive+transitive relation) among the strict
>preferences of a voter...

	Okay. Let's say that my ordinal ranking of three candidates is Dean >
Kerry > Bush. But let's also say that I prefer Kerry to Bush MUCH MUCH
MORE than I prefer Dean to Kerry. (This is a very realistic situation.) I
want a method that enables me to express this. I want to be able to
clearly express the Dean > Kerry preference while minimizing the
possibility that it will help change the result from Kerry to Bush.
	(Likewise, if my preferences were Bush >>>> Kerry > Dean, I would want to
clearly express the Kerry > Dean preference while minimizing the chance
that it would change the result from Bush to Kerry.)
	Also, it's not enough to just say that the Kerry > Bush gap is more
important than the Dean > Kerry gap. I also want to be able to roughly
distinguish between whether it's just a bit more important to me (e.g.
Dean 100 > Kerry 55 > Bush 0), or whether it's MUCH more important to me
(e.g. Dean 100 > Kerry 99.99 > Bush 0). I think that the best way to
represent this is with cardinal values.
	I am not saying that people's cardinal ratings of a candidate will
necessarily be an accurate representation of the perceived utility of that
candidate to them, and hence I am not making any definitive judgements
about the ontological status of personal utility functions. However, I
think that a cardinal ratings ballot can serve as a very rough
approximation of utility, and can be very useful in determining which
defeats have the highest priority for the voters.
	Perhaps it would be a good idea if I didn't use the word "utility" so
much in my proposal, maybe not at all, because it invites this sort of
misunderstanding.  
	Meanwhile, I want a method that is Condorcet efficient, Schwartz
efficient, and more-resistant to strategy than either winning votes or
margins. I believe that weighted pairwise delivers each of these.
>
>I will try to translate the following into the context of quasi-ordered
>preferences instead of cardinal ratings, since I agree that it would be
>desirable to take them into account if they can be determined with some
>accuracy. Still, I'm not sure how one should design ballots then.

	I think that the simplest thing is to do what Chris Benham proposed: to
have voters submit cardinal information, but then interpret the
information in terms of prioritized preferences. That is, from each
cardinal ballot, you should be able to infer both a set of preference
pairs (C>D, D=E, etc.) and an ordering of the preference pairs themselves
(with the possibility of ties, of course). I believe that a cardinal
interface is more intuitive than any other possible way of prioritizing
ordinal preferences.
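	To make the idea concrete, here is a minimal sketch in Python of reading a single cardinal ballot as a prioritized list of preference pairs. The function name, the 0-100 scale, and the choice to infer rankings directly from ratings are assumptions made for illustration, not part of the proposal itself.

```python
from itertools import combinations

def prioritized_preferences(ratings):
    """Return this ballot's pairwise preferences, strongest gap first.

    `ratings` maps candidate -> cardinal rating (hypothetical 0-100 scale).
    Each result is (preferred, other, gap); a gap of 0 is an equal ranking.
    Assumption: rankings are inferred directly from the ratings.
    """
    pairs = []
    for x, y in combinations(ratings, 2):
        gap = ratings[x] - ratings[y]
        if gap < 0:
            x, y, gap = y, x, -gap
        pairs.append((x, y, gap))
    # A larger rating gap means a higher-priority preference; equal gaps tie.
    return sorted(pairs, key=lambda p: -p[2])

ballot = {"Dean": 100, "Kerry": 55, "Bush": 0}
for preferred, other, gap in prioritized_preferences(ballot):
    relation = ">" if gap else "="
    print(f"{preferred} {relation} {other} (gap {gap})")
```

	With this ballot the Dean > Bush preference comes first, then Kerry > Bush, then Dean > Kerry, which is the kind of ordering of the preference pairs described above.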
	Chris's compressing-ranks method is an interesting one, and is a worthy
competitor to weighted pairwise in the realm of ordinal-cardinal
Condorcet methods. My main critique of compressing ranks lies on strategic
grounds, that is, it is vulnerable to the same chicken-game
cooperation/defection dilemma as approval voting.
>
>However, if you give one candidate a
>> higher rating than another, then you must also give the higher-rated
>> candidate a higher ranking.
>
>I don't understand this restriction! I would understand the opposite,
>when small differences in the rating need not require a different
>ranking, but why this way?

	There's a good reason. If you rank two candidates as equal, but you give
them different ratings, then you are basically just wasting a portion of
your ratings weight. Why? Because you ranked the candidates as equal, your
ratings differential will not be counted towards the defeat strength of
the pairwise comparison, no matter who wins it. Since there is no
functional reason to let people waste their voting power like this, I
chose to disallow it. The rules governing tied/untied rankings/ratings are
not something I feel strongly about, however. There are a few different
variations which I'm willing to accept here.
>
>(Note that voters who rank B over
>> A, or rank them equally, do not contribute to the weighted magnitude;
>> hence it is never negative.)
>
>But it might be zero, right? 

	Yes, that's right. Zero is the minimum weighted magnitude of a pairwise
defeat. A zero in A's defeat of B results if and only if everyone who
ranked A above B gave them equal ratings.
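	As a rough illustration, the weighted magnitude could be computed as in the Python sketch below. The ballot layout (one dict of rank positions and one dict of ratings per voter) and the function name are hypothetical choices for this example only.

```python
def weighted_magnitude(ballots, x, y):
    """Sum of (rating of x - rating of y) over ballots ranking x above y.

    Each ballot is a (ranks, ratings) pair of dicts; a lower rank number
    means more preferred. This layout is a hypothetical one for illustration.
    """
    total = 0
    for ranks, ratings in ballots:
        if ranks[x] < ranks[y]:          # only voters who rank x above y count
            total += ratings[x] - ratings[y]
    return total

ballots = [
    ({"A": 1, "B": 2}, {"A": 100, "B": 40}),  # A>B with a gap of 60
    ({"A": 1, "B": 2}, {"A": 70, "B": 70}),   # A>B but rated equally: adds 0
    ({"A": 2, "B": 1}, {"A": 0, "B": 100}),   # B>A: contributes nothing to A>B
    ({"A": 1, "B": 1}, {"A": 50, "B": 50}),   # A=B: contributes nothing
]
print(weighted_magnitude(ballots, "A", "B"))
print(weighted_magnitude(ballots, "B", "A"))
```

	Because only voters who rank x above y are summed, and a higher ranking can never go with a lower rating, the total cannot go below zero.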

>But it wouldn't even
>suffice to have metrically scaled utilities for each voter, those
>utilities must even be comparable inter subjects which is another thing
>I doubt... 

	Interpersonal comparison of utilities is not necessary for this method.
Like most other voting methods, weighted pairwise gives everyone's ballot
the same weight, and doesn't try to measure whether one person feels more
strongly about the election overall. What WP does do is allow
individuals to report which of their own preference rankings are most
important to them.
>
>
>Ranked pairs and beatpath
>> are my preferred choices.
>
>River is mine, of course :-)

	Okay. I've been thinking about it for a while, and as far as I can tell so
far, it looks good. So, I have added it to that part of the proposal, and
will probably add it to other web pages I have done when I get the time.
And speaking of that, have you written a paper or a web site on river
which I can cite in my WP paper?
	Anyway, yes, I can't think of any concrete reason why river would be
inferior in practice to beatpath or ranked pairs; it does pass an
additional criterion, and the rules do seem to possess a fair internal
justification. I don't know of anyone who has been able to do a really
profound competitive evaluation of the three methods, so unless someone is
able to demonstrate the clear superiority of one, it makes sense to
advocate all three of them. Perhaps none of them has any severe
disadvantage relative to the others. If there is a significant issue, then I assume
it will be a strategic one, and I haven't yet found or heard of anything
like that myself.
>
>> Additional provisions:

	Let me just be clear that the two "additional provisions" are
non-essential to the weighted pairwise method. I am quite happy to accept
a version of WP that does not include them.
>
>
>Thinking about this, I get the following idea: Consider taking only the
>negative sum of the ratings of the defeated option instead of the
>difference in ratings. This seems to have two advantages: We need no
>longer explain what a difference in rating should mean, and we
>effectively resolve cycles by choosing the option with the largest
>utility (assuming such a thing exists)! Actually, didn't you mention the
>problem of "compressing" preferences? Isn't that a larger temptation
>when differences are used than when absolute values are used? But
>perhaps I'm wrong here...

	I believe so. I don't like Condorcet methods that choose a CW when one
exists but then do something totally different when there is no CW. For
example, Condorcet completed by IRV, Condorcet completed by approval
voting, Condorcet completed by cardinal ratings. Basically, I think that
these methods lack parsimony and are therefore strategically unstable. I
think that the disconnect between the pairwise component and the
completion method is too great. It becomes essentially a matter of
starting over with a new method, which I consider to be undesirable,
because I most prefer pairwise methods.
	Unless I am mistaken, what you are suggesting is equivalent to simply
reverting to cardinal ratings in the event of a majority rule cycle,
although I suppose you are choosing from the Schwartz set. This is not
satisfactory to me at all. Basically, those who feel that their preferred
candidate would have a higher cardinal score than the CW can create a
cycle, which is a relatively broad and easy strategy. And anyway, once you
enter the cardinal ratings phase of the election, the
cooperation/defection dilemma will have a severe distorting effect. No es
bueno.
	So, there may or may not be incentive to compress ratings in weighted
pairwise, but to the extent that it exists, it doesn't lead to
compressions in rankings. Hence the integrity of the pairwise comparison
remains, by contrast to the ordinary cardinal ratings sum method. I sort
of imagine the ordinal aspect of the WP method as more-rigid cell walls
which hold the more-squishy substance of the cardinal aspect together,
preventing it from dissolving into a non-majoritarian mess. Maybe this
metaphor is pretty silly.
>
>> 2. maximize
>
>Ah! Here you implicitly neglect that utilities can be compared between
>voters! When you can rescale them, you cannot add them anymore without
>the sum being a completely meaningless thing! Either the values are on
>an absolute and meaningful scale so that you can add but not rescale
>them, or they are not so that you can rescale but no longer add them...

	As I say above, I'm not claiming that the CR scores reflect a real
utility function. Instead, I am interested in giving voters equal power.
The maximization in scale provision was suggested by Chris Benham. Again,
it's not inextricable from the WP method, but I think that it is probably
helpful, because voters will not have to second-guess as to who will be in
the Schwartz set (union of minimal undominated sets), when formulating their
ratings. The ratings only affect the winnability of candidates in the
Schwartz set anyway, so the point of the provision is to give everyone
equal power in terms of the contest between the different candidates in
the Schwartz set.
	So, the ratings differentials aren't supposed to represent an absolute
value, but rather the relative strength of a voter's own preferences
within a given set of candidates.
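	One way to picture the maximization in scale provision is the Python sketch below. It assumes the provision amounts to a linear rescale of each ballot so that the voter's ratings span the full range among the Schwartz set candidates; the exact rule in the proposal may differ, and the function name and 0-100 scale are hypothetical.

```python
def maximize_in_scale(ratings, schwartz_set, top=100.0):
    """Stretch one voter's ratings so they span 0..top within schwartz_set.

    Assumption: the provision is a linear rescale per ballot; the exact rule
    in the proposal may differ. Candidates outside the set are dropped here.
    """
    inside = {c: r for c, r in ratings.items() if c in schwartz_set}
    low, high = min(inside.values()), max(inside.values())
    if high == low:                      # voter is indifferent within the set
        return {c: 0.0 for c in inside}
    return {c: top * (r - low) / (high - low) for c, r in inside.items()}

# This voter's favorite, A, is outside the (hypothetical) Schwartz set, so
# the remaining ratings 20, 10, 0 are stretched to 100, 50, 0.
ballot = {"A": 100, "B": 20, "C": 10, "D": 0}
print(maximize_in_scale(ballot, {"B", "C", "D"}))
```

	The relative sizes of the voter's gaps within the set are kept, but the voter no longer loses weight for having spent most of the scale on a candidate who turned out not to be in contention.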
>
>> This example deals primarily with the possibility of strategic
>> incursion by the voters who favor candidate C. If sincere votes are
>> cast, A beats both B and C, while C beats B. However, those whose
>> preferences are C>A>B have an opportunity to gain an advantage by
>> insincerely voting C>B>A. 
>
>Only if then the cycle with least winning votes is dropped! 

	Or least margin...

>Not when the
>chances become 1/3 or any other positive number for each since then they
>would get their least preferred outcome with a positive probability
>instead of 0. The latter is the case with ROACC, for example, or when we
>slightly perturb the winning votes strengths of all defeats before
>applying beat path or Tideman or river.

	I already expressed my criticism of ROACC in an earlier post, and I'll
follow up with your reply shortly. I'd rather not confuse the two
threads...
>
>However, such a counter-strategy is fraught
>> with instability, since the different groups of voters have no way of
>> knowing how the other groups will vote until it is too late to make a
>> change.
>
>I think that is exactly why people usually think that counter-strategies
>*are* an effective measure to deter strategic behaviour, since they show
>that by starting this tit-for-tat like behaviour by considering
>strategic voting makes the result quite uncertain instead of preferable!

	I see your point, but I don't want a voting method that pushes / tempts
people towards this sort of highly unstable disequilibrium. I think that
we can and should mitigate this particular problem of pairwise methods. 
	There are a few reasons why the prospect for cooperation is not too
promising in this situation. For example, the game does not repeat very
often (four years later, everything has changed: the voters, the issues,
the candidates), so reprisal is not much of a factor.
Reprisal is also limited by the anonymity of voting.
>
>> Another possibility is for the B>A>C voters to compromise by voting
>> B=A>C. If at least 3 of them do this, the B-->A defeat will be the
>> weakest in the cycle, and will be dropped. The problem with this
>> (counter)-strategy is that voters will not have perfect information
>> before the election. Instead, they will face the possibility that B
>> is a sincere winner (even a sincere Condorcet winner), and that
>> voting B=A>C instead of B>A>C would be to needlessly hand the victory
>> over to A. Hence the B>A>C voters will face a dilemma between voting
>> B>A>C, leaving open an opportunity for the C voters to steal the
>> election, and voting B=A>C, giving up the hope of getting their first
>> choice elected.
>
>Seems you are right here at first. But upon closer inspection that very
>same argument applies for the original strategic voters also: When they
>don't have *quite* accurate information, they must fear that by
>reversing A>B to B>A they will get their worst option B (this can happen
>when only one voter has changed his preferences after the polls)! So
>they take a large risk when trying to elect C in this way...

	Yes, this is a good point. It's true that the strategizers also take a
risk. There is a tension between risk and reward. But if a C>A>B voter's
C>A preference is much stronger than their A>B preference, then reward is
likely to outweigh risk... so the voters can quite rationally start down
the dark path of strategic manipulation. (hehe, sorry about the dramatic
phrasing...) 
	For the strategy to work here, the A>C defeat needs to be weaker than the
C>B defeat; otherwise the newly created B>A>C>B cycle would resolve in
favor of B if not A. In the example I gave, the defeat strengths there are
very close... and therefore yes, without unrealistically good information,
the C voters can project a fairly high probability of their strategy
handing over the election to B, their least favorite.
	It's not hard, however, to concoct examples where the A>C defeat is more
obviously weaker than the C>B defeat. Although those examples generally
require a wider circle of strategizers, the risk of a direct backfire is
substantially reduced.
>
>In many cases, I think that the
>> initial strategy of the C>A>B voters switching to C>B>A wouldn’t even
>> be effective in the first place; hence counter-strategy would be
>> unnecessary. That is, going with the assumption that candidates A and
>> B are relatively similar, 
>
>and what if they are not similar?

	One of the key strategic properties of WP is that it attempts to make it
so that those who have the most incentive to strategically alter the
result will probably be those with the least ability to do so. There are
impossibility theorems that say that strategy is inevitable, but there is
no impossibility theorem that says that strategic ability cannot be
roughly distributed in inverse proportion to strategic incentive.
	I'm currently working on a more formal presentation of my weighted
pairwise proposal, and the biggest challenge I am facing is trying to
express WP's anti-strategic properties in a clear, general, and correct
way. Actually I've been driving myself a bit nuts working on this
problem... I have the intuition of how it works, but finding formal
statements that are tautologically correct is very hard.
	So hopefully I will be able to give you a more satisfying statement later
than I can now.
	But basically, the most severe strategy problem in winning votes /
margins is that supporters of C with preference rankings C>A>B will bury A
(the sincere CW), by voting C>B>A. This can create a fake B>A beat which
overrules the genuine A>C beat, changing the result from A to C. So, in a
way, the C voters are using B as a "weapon" against A.
	In WP, the ability of anyone to advantageously use B as a weapon against
A decreases as A and B are regarded as more similar to one another,
because the B>A defeat gets smaller, and its chance of overruling the A>C
defeat decreases.
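	Here is a small Python sketch of the burying scenario. All of the voter counts and ratings are invented for illustration, and the resolution rule (drop the weakest defeat in a three-candidate cycle) is a simplification that I believe matches ranked pairs, beatpath, and river when there are only three candidates and no ties. The C voters are allowed to place B anywhere from 0 to 100 when they bury A.

```python
from itertools import permutations

CANDIDATES = "ABC"

def tallies(ballots):
    """votes[x, y]: voters ranking x over y; weights[x, y]: their rating gaps."""
    votes = dict.fromkeys(permutations(CANDIDATES, 2), 0)
    weights = dict.fromkeys(permutations(CANDIDATES, 2), 0)
    for count, ratings in ballots:       # dict order doubles as the ranking
        order = list(ratings)
        for i, x in enumerate(order):
            for y in order[i + 1:]:
                votes[x, y] += count
                weights[x, y] += count * (ratings[x] - ratings[y])
    return votes, weights

def winner(votes, strength):
    """Condorcet winner if one exists; otherwise resolve the three-way cycle."""
    defeats = [p for p in votes if votes[p] > votes[p[::-1]]]
    for c in CANDIDATES:
        if sum(1 for x, _ in defeats if x == c) == 2:
            return c
    # In a three-candidate cycle each candidate loses exactly once, so the
    # loser of the weakest defeat is left undefeated.
    return min(defeats, key=lambda d: strength[d])[1]

# Hypothetical electorate: A and B are rated as similar, C as very different.
others = [(44, {"B": 100, "A": 90, "C": 0}), (10, {"A": 100, "C": 50, "B": 0})]
sincere = [(46, {"C": 100, "A": 10, "B": 0})] + others

def burial(b):
    """The 46 C voters insincerely vote C>B>A, rating B at b and A at 0."""
    return [(46, {"C": 100, "B": b, "A": 0})] + others

votes, weights = tallies(sincere)
print("sincere:", winner(votes, votes), winner(votes, weights))
votes, weights = tallies(burial(50))
print("burial, winning votes:", winner(votes, votes))
wp_winners = {winner(*tallies(burial(b))) for b in range(101)}
print("burial, weighted pairwise:", sorted(wp_winners))
```

	In this example the burial hands the election to C under winning votes, but under weighted pairwise no choice of b makes the A>C defeat the weakest in the cycle, so the strategy yields A or B and never C.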
	Now, if A and B are relatively similar candidates, while C is a highly
different candidate such that the weights of the A-C gaps are much larger
than those of the A-B gaps, it stands to reason that the average C>A voter
will be more ideologically antithetical to A than if the A-C gaps were
similar to or less than the B-C gaps.
	I see two tendencies that will *tend* to converge, although they may not
do so in every case. 
1. As the "difference gap" between a voter's preferred candidate and the
sincere winner grows, the more we will expect that voter to have views
that are highly antithetical to the initial winner. Again, this doesn't
hold always, but it should be a probabilistic trend.
2. As the "difference gap" between a voter's preferred candidate and the
sincere winner grows, the more difficult it will be to overrule that
defeat by fabricating a fake defeat or falsely bolstering a real but weak
defeat. 
	Hence, to some degree, weighted pairwise achieves an anti-strategic
counterbalancing of incentive and ability.
	However, as I said, I need to continue working on the clarity of this
idea.
>
>> Again, A is a Condorcet winner, and C beats B. The C voters cannot
>> change the fact that A beats C. Also, even if they reversed the A>B
>> defeat to make a cycle, they couldn’t do anything to get the A-->C
>> defeat anywhere close to being the weakest in terms of weighted
>> magnitude. Hence, no matter what they do, C won’t win. 
>
>OK. This is also the case when we apply Approval in case of cycles or,
>somewhat more sophisticated, compare the defeats by a hierarchy of
>measures as proposed earlier, e.g. first by class (simple, absolute,
>2/3, 3/4, ...) and then by negative approval (or, if you would, sum of
>individual utilities) of the defeated option. This would avoid
>introducing cardinal utilities here.

	Sorry, I don't know what you're talking about here... let me emphasize
that I don't like the principle of approval voting at all. It fails mutual
majority and is highly vulnerable to the cooperation/defection dilemma...
but aside from the word "approval", I'm lost.
>
>To be more specific, if any
>> defeat has more than 1/3 of the highest possible weighted magnitude,
>> it can’t be dropped.
>
>Is there a standard example for this claim?

	I don't know what you mean by "standard example". The real question is
whether I have logical proof. Actually, no, not only could I not prove it,
but it was incorrect as written. I needed to modify this claim... looks
like you read an earlier version. The number 1/3 as a limit to droppable
defeats only applies to dropping defeats in three candidate cycles. In
larger cycles, larger defeats can conceivably be dropped. I have been
working very hard for the last week trying to understand the full
implications of this, but it's rough going. I still believe that my claim
(a partial counterbalancing of incentive against ability) has validity,
but the possibility of larger cycles makes it harder to quantify and
precisely define. I'm still working on it!
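	For the three-candidate case, the Python sketch below is a numerical check rather than a proof. It assumes ratings on a 0-100 scale with rankings taken directly from the ratings, and the names are hypothetical. The reasoning in the comment is that one ballot can add at most its full rating range to the three defeats of a cycle combined, so the weakest of the three cannot exceed one third of the highest possible weighted magnitude.

```python
import random

def cyclic_magnitudes(ballots):
    """Weighted magnitudes of A>B, B>C, C>A, with rankings taken from ratings."""
    pairs = [("A", "B"), ("B", "C"), ("C", "A")]
    return [sum(max(r[x] - r[y], 0) for r in ballots) for x, y in pairs]

random.seed(0)
top, voters = 100, 50
for _ in range(10_000):
    ballots = [{c: random.randint(0, top) for c in "ABC"} for _ in range(voters)]
    weakest = min(cyclic_magnitudes(ballots))
    # Each ballot adds exactly (its highest rating - its lowest rating), at
    # most `top`, across the three defeats, so the weakest defeat is at most
    # a third of the highest possible magnitude (voters * top).
    assert weakest <= voters * top / 3
print("bound held in every trial")
```

	This says nothing about cycles of four or more candidates, where larger defeats can conceivably be dropped.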
>
>As far as I can see, you are right in having shown a good resistance to
>some strategies!

	Have I? Excellent.
	I intend to go on working on this, trying to demonstrate that greater
strategic resistance is needed in pairwise methods, and that WP can
provide an appropriate level of strategic resistance.
>
>Have you tried to determine some more theoretical features of the
>weighted pairwise (class of) method(s)? Most properties of the "base"
>method should be inherited, it seems...
>
	Yes, I'm working on that. Perhaps you can help me? : )
	I'm thinking that Pareto is no problem.
	Monotonicity: I'm pretty sure of it without the maximization in scale
provision. With it, there might be a slight loss of monotonicity, although
I haven't put an example together yet.
	Resolvability: I haven't even thought of a tiebreaking procedure yet, but
I suppose that I should. I'd have to imagine that a halfway-decent
tiebreaker method would give overall resolvability.
	Independence of clones: Probably yes, if river, beatpath, or ranked pairs
is used as the base(?) method.
	Reversal symmetry: I don't see why not.
	Schwartz efficiency: Yes, if river, beatpath, or ranked pairs is used.
	Any others you think I should add here?

	In general, I'm trying to take a more general / less anecdotal approach
in the new version of the proposal... so, yes, that's me trying to get
more theoretical.

my best,
James