[EM] River method -- updated summary
heitzig-j at web.de
Wed Oct 6 12:37:12 PDT 2004
Here's an updated summary on the River method.
Short definition of River
in its "top-down version" for use with large electorates,
formulated in usual terminology:
1. Allowing abstentions, determine all pairwise defeats. Ties are not
2. Process all defeats by descending strength (=size of supporting
majority), applying whatever tiebreaker (for example, the one used in MAM).
3. Add the defeat as a solid arrow to the diagram, if that would not produce
a) a cycle (until here it is the same as Ranked Pairs)
b) or a branching (that is, two defeats defeating the same option)
4. The option uniquely undefeated at the end wins.
See below for alternative definitions for use in smaller groups.
1. Immunity (every defeat against the winner faces a chain of equally
strong defeats leading back to the winner. Beatpath and Ranked Pairs are
the two other major immune methods). This implies Smith/Schwartz, which
in turn imply Condorcet/Llull (an option defeating all others wins).
Immunity also implies Steve Eppley's strategy criteria.
2. Resoluteness (choice is only used to resolved final ties as in MAM)
3. Monotonicity (replacing an individual preference A>B by A=B or A?B,
or replacing an individual equivalence A=B or undecidedness A?B by A<B
does not make A win or harm B)
4. Independence of Pareto (or strongly) dominated alternatives
(IPDA/ISDA), which are fulfilled by neither Beatpath nor MAM. See below
for a proof.
5. Summability (depends on the pairwise matrix only except for rare
tiebreaking situations. IRV is extremely non-summable)
6. The final result is easily justified by showing the resulting
tree-like diagram of n-1 arrows which shows that any majority argument
against the winner is in conflict with stronger arguments of the same
kind. See below for further improvement. Here's an example of such a
diagram (lines are upward arrows resp. downward defeats, labelled with
their strengths, winner is A):
67| 51| \83
| | \
D E F
7. River fails ICI/INI (like MAM, I guess), the "Secret Preferences
Criterion", and the "Immunity From Squawking" criterion.
8. It is not yet clear whether it fulfils the "Intuitive Loser
Criterion", the "Sincere Expectation Criterion", and the "1st Choice
Criterion", but I hope so :-)
Examples where River differs from Beatpath and Ranked Pairs
With four options and pairwise different strengths of defeats, River
finds the same winner as either Beatpath or Ranked Pairs, except for the
following 5 out of 720 possible situations (up to symmetry). Here, AB>CD
denotes that the defeat A>B is stronger that the defeat C>D.
In these situations, River elects C while Beatpath and Ranked Pairs elect B.
Ranked Pairs and River, majority complaints,
and improving the diagram (River+)
Concerning the set of affirmed majority defeats, the main difference
between Ranked Pairs and River is that Ranked Pairs affirms a maximal
number of large defeats, while River tries to affirm only so much
defeats as are needed to determine a unique winner. The main consequence
of this difference is that the River diagrams are much simpler
(tree-like) and can thus show the process and reasons for the final
decisions easily and efficiently.
So both have the same spirit of resolving cycles of strong defeats
before cycles of weak defeats while all versions of sequential dropping
do the opposite. But the river method lets a strong defeat, say Y>X,
contradict a weaker one, say A>B, only if that strong defeat is relevant
to decide whether X wins or not. When X is already defeated by another,
even stronger defeat, say Z>X, then Y>X is not used and the weaker
defeat A>B gets a chance to decide between A and B. On the contrary,
Ranked Pairs would keep both Z>X and Y>X and drop A>B unnecessarily.
Methods which resolve cycles of weak defeats before cycles of strong
defeats will drop A>B before even looking at the relationship of Y>X and
When, after the decision for X, a majority should complain that they
prefer Y over X, the River diagram easily shows a larger beatpath from X
to Y. However, this beatpath may not be the strongest existing beatpath
from X to Y, hence the complaint would not be rebutted as strongly as it
could be. Although this can happen with Ranked Pairs too, the Ranked
Pairs diagram usually contains more majorities and one can expect some
more "rebutting power" here.
One solution is of course to publish the River (or Ranked Pairs) diagram
showing the process of the method and the reason for the decision,
together with a second diagram showing all *strongest* beatpaths from
the River (or Ranked Pairs) winner to all alternatives. This second
diagram is also tree-like in general and could be called the "beatpath
tree" of the winner. (I wonder whether "beatpath trees" have perhaps
already been discussed in the context of the beatpath method for which
they can of course also be published to rebut majority complaints.)
The problem with two published diagrams, however, would be that they
might erroneously be interpreted as showing two social orderings. This
is because although both trees show the same winner, the combination of
both diagrams could produce cycles. Although this might be more probably
in the Ranked Pairs case where the original diagram shows already more
majorities than a tree, it can also easily happen in the River case.
For the River method, a second possibility to increase its rebutting
power ex post is to incorporate additional information into the River
diagram after the winner has been determined:
1. Construct the River diagram in whatever way (interactively,
semi-interactively, or top-down).
2. process all majorities again by descending magnitude and incorporate
those as dashed arrows into the diagram which do not introduce cycles
(but allowing branchings now).
3. Third, remove every dashed arrow which does not belong to some
strongest shown beatpath from the winner to any alternative.
In other words, perform River to *find* the winner (solid arrows), then
proceed with Ranked Pairs to help *defending* that winner (dashed arrows).
This results in a diagram with at most one solid and one dashed arrow
originating at every non-winner, hence containing at most 2n-2 arrows
for n options, which is still much simpler than a full Ranked Pairs
diagram which in general contains O(n^2) arrows I fear. The solid lines
show the process of the method, the dashed arrows give additional
Of course, one could perform step three of the above definition also to
condense an Ranked Pairs diagram by removing majorities inessential for
defending the winner. However, in contrast to River+, this can result in
removing majorities that were essential for *finding* the winner in the
Further computational evidence that River is
a compromise between Beatpath and Ranked Pairs
To study the "rebutting power" of River+, I performed some simple
simulations. They are not as refined as Steve's simulations in that they
directly generate a random preference matrix instead of simulating
individual voters. I used three parameters: No. of options,
"Uncertainty" U and "Linearity" L.
(Details: For each pair of options Xi and Xj with 1<=i<j<=n
independently, first the fraction of voters with no strict preference
between Xi and Xj (="undecided voters") was drawn uniformly at random
between 0 and U. Then the fraction of "decided voters" preferring Xi to
Xj was drawn uniformly at random between L and 1. In particular, all
majorities were different with probability 1!)
For each of the simulated methods (Beatpath, Ranked Pairs, and River), I
computed the winner and its beatpaths. For the Ranked Pairs and River
winners, also those beatpaths contained in the Ranked Pairs, River, or
River+ diagrams were determined, giving six groups of beatpaths:
(1) All existing beatpaths from the Beatpath winner,
(2) All existing beatpaths from the Ranked Pairs winner,
(3) All existing beatpaths from the River winner,
(4) All beatpaths from the Ranked Pairs winner affirmed by Ranked Pairs,
(5) All beatpaths from the River winner affirmed by River, and
(6) All beatpaths from the River winner affirmed by River+.
For each of those beatpaths, I computed
(a) the strength (= magnitude of its weakest element),
(b) the margin (= strength - no. of voters prefering the alternative),
(c) a "probabilistic strength" (= product of all elements' relative
(d) its length.
In each of the six groups of beatpaths, I computed the minimum and mean
strength and probabilistic strength, and the maximum and mean length.
For all three winners, I additionally computed the number and sum of
defeats, the balance of defeats (= sum of defeats - sum of complaints),
and which winner beat which pairwise. Finally, the sizes of the Smith,
Weakly Immune, and Immune sets were computed, since all three winners
belong to them.
First results indicate that the three methods often find the same winner
and perform very similar judged from the above measures.
Example with 10 options, no uncertainty (that is, no truncated ballots),
and L=0.4 (that is, with some correlation between voters preferences):
In only about 1% of the cases, the River winner differed from both
Beatpath and Ranked Pairs winner, and in only about 0.5% Beatpath and
Ranked Pairs were the same but River was not. In contrast, in about 5%
of the cases, River and Ranked Pairs were the same but Beatpath was not,
and also in about 5% of the cases, River and Beatpath were the same but
Ranked Pairs was not. There were only about 11% of the cases where the
Immune set was not a singleton. That is, in almost *all* cases where
there were more than one immune element, Ranked Pairs and Beatpath chose
different winners. In other words, whenever they are "immunologically
allowed" to choose differently, it seems they almost certainly do...
As for "rebutting power", the picture is unclear. Depending on which
measure you look at, either of Beatpath, Ranked Pairs, and River+
performs slightly better than the two others, where River+ is mainly
either first or second.
Example with 4 or 20 options, and no uncertainty or linearity:
Judging from min. strength or min. margin, we get
Beatpath > River+ > Ranked Pairs.
Judging from mean strength, mean margin, or balance of defeats, we get
River+ > Beatpath > Ranked Pairs.
Judging from mean probabilistic strength, we get
River+ > Ranked Pairs > Beatpath.
Judging from who beats whom, max. length, mean length, or sum of
defeats, we get
Ranked Pairs > River+ > Beatpath.
Judging from number of defeats (= Copeland score), we get either
Ranked Pairs > River+ > Beatpath
Ranked Pairs > Beatpath > River+.
In all, River+ makes a nice compromise between Beatpath and Ranked Pairs
when such performance measures are considered important.
Proof that River[+] fulfils ISDA and IPDA
Def. INDEPENDENCE OF STRONGLY DOMINATED ALTERNATIVES (ISDA):
Removing a strongly dominated alternative must not change the winner.
X is STRONGLY DOMINATED by an alternative Y if
(i) Y beats X
and, for all Z distinct from X,Y:
(ii) if Z beats Y, Z beats X even stronger,
(iii) if Z beats X, Y beats X even stronger,
(iv) if X beats Z, Y beats Z even stronger, and
(v) if Y beats Z, Y beats X even stronger.
In the general case, Pareto-dominated alternatives are also strongly
dominated but not vice versa, hence ISDA is then stronger than IPDA. As
Steve already pointed out for Pareto-dominated alternatives, such
strongly dominated alternatives might be easily be found by a losing
party and be added strategically to change the winner, which should not
River[+] fulfils ISDA.
Proof: Assume X is strongly dominated by Y and its removal changes the
River winner. Then the river diagram must contain arrows
Z---a-->X---b-->Y (a and b being the strengths), hence also Y beats Z
with some strength c, where b>c>a. As River affirms X>Z, it did not
affirm Y>Z before, so it affirms a beatpath from Z to Y even stronger
than c -- a contradiction since also Y>X>Z is affirmed!
(By the way, (ii) is not needed for the proof.)
Interactive version of river
for use in small electorates:
1. Let the deciders sit down in a circle. Using pencil, paper, and
eraser, they will draw a system of "rivers", originating from the
"worst" options and flowing past "better" options towards a common
destination: the "winning" option. Beginning with an empty sheet of
paper and an arbitrary decider, and proceeding in clockwise order, each
decider gets the opportunity to give part of her opinion by telling that
she likes some option X better than some other option Y and then shortly
argueing in favour of X and against Y. After each of these statements,
the system of rivers is changed by first "digging a river bed" between X
and Y and then checking the "course" of the rivers. This is done
according to the following easy rules:
2. Directly after the decider has told his opinion that X be better
than Y, it is counted how many deciders agree in prefering X over Y and
how many disagree with it and prefer Y over X instead. (Some deciders
might also abstain from this partial decision.) If the second number is
not larger than the first, the new river bed allows a flow from X to Y,
otherwise it allows a flow from Y to X instead. In both cases a dashed
arrow between X and Y indicating the direction of the possible flow is
drawn, and the larger of the two numbers is written besides the arrow's
tail. This number is considered the "width" of the river bed between X
and Y. After the new river bed has been "digged" in this way, it is
checked whether the course of the rivers changes as a consequence:
3. It is first checked whether the rivers actually use the new bed: a)
If there is already a river segment starting at the same point as the
new bed, and if its bed has a larger width than the new bed, the new bed
is not used and its arrow is erased. b) If there is already a river in
the system flowing from the point where the new bed ends towards the
point where the new bed starts, and if that part of the river has a bed
which has at all points the same or a larger width than the new bed,
then the new bed is also not used and its arrow is erased. Otherwise,
the rivers do use the new bed and its arrow is drawn as a solid line.
4. When the new bed is indeed used by the rivers, it is then checked
whether perhaps some other segments of the rivers "run dry" or must be
"dammed": a) Each river segment starting at the same point as the new
bed and having a bed of smaller width changes its course into the new
bed, so that their old bed "runs dry" and its arrow is erased. b) As
long as there is any circular flow in the system of rivers, that segment
of the circular flow which has the narrowest bed is "dammed" and its
arrow is also erased from the drawing. (In the rare case that there is
more than one segment having the narrowest bed, all of these are dammed!)
These possible erasures make sure that there do not remain any circular
flows in the system and that the rivers only branch when they find beds
of exactly equal width. (On rare occasions, if the drawing shows
complicated slopes since many arrows have been erased, it might be
useful to redraw the current system of rivers on a new sheet of paper
and choosing better positions for the options.)
5. After each decider has had the opportunity to suggest new river
segments several times in this ways, the system of rivers will finally
be stable, all rivers will end in one and the same point, and no further
changes will be possible. In general, the final destination does not
depend on the order in which statements are made! That common
destination is the winning option, and the final course of the rivers
represents the course of the discussion. The final drawing also shows
easily why the winner can be considered better than all other options,
since from each other option a sequence of arrows leads towards the
winner, each arrow representing a majority opinion none of which can be
contradicted by other majority opinions of equal strength. (However,
even if the final drawing might be one linear flow, it must not be
interpreted as a complete ranking! To determine a 2nd choice winner when
the 1st choice becomes unavailable, the final arrows to the 1st choice
winner must be removed and the discussion must be continued without the
1st choice winner, until the system gets stable again!)
A decider issues a statement of the form "X is better than Y".
Count agreements and disagreements with this statement, but allow
Draw a new "river bed" as a dashed arrow in the direction of the
Annotate the majority's number alongside the new bed as its "width".
Erase the new bed if
a) there is a wider branch at that starting point, or
b) it would become the narrowest segment in some circular flow.
Otherwise redraw the new arrow as a solid line.
Erase each river segment which
a) runs dry since it is at a branching point with a wider branch, or
b) must be dammed as the narrowest segment in some circular flow.
in clockwise order until stable. The common destination of all rivers wins.
Here's an example:
7 times A>C>D>B
6 times B>A>C>D
4 times B>C>D>A
3 times D>B>A>C
2 times D>A>B>C
2 times D>C>A>B
1 time D>B>C>A
Number of deciders ...over...
preferring... A B C D Sum:
A - (11) 18 13 42
B 14 - 16 (10) 40
C (7) (9) - 17 33
D (12) 15 (8) - 35
Selected River Method applications (A>B means an arrow from B to A!):
(i) C>D? 17+ 8- (added) D ---> C
B>C? 16+ 9- (added) D ---> C ---> B
D>B? 15+ 10- (not used)
A>B? 11+ 14- (B>A added) D ---> C ---> B <--- A
A>C? 18+ 7- (added, B>C runs dry) D ---> C ---> A ---> B
D>A? 12+ 13- (A>D not used)
D>B! (added, B>A dammed) B ---> D ---> C ---> A
(ii) B>A? 14+ 11- (added) A ---> B
D>B? 15+ 10- (added) A ---> B ---> D
C>D? 17+ 8- (added) A ---> B ---> D ---> C
A>C? 18+ 7- (added, B>A dammed) B ---> D ---> C ---> A
D>A? 12+ 13- (A>D not used)
B>C? 16+ 9- (not used, A wins)
(iii) A>C? 18+ 7- (added) C ---> A
C>D? 17+ 8- (added) D ---> C ---> A
B>C? 16+ 9- (not used)
D>B? 15+ 10- (added) B ---> D ---> C ---> A
B>A? 14+ 11- (not used)
A>D? 13+ 12- (not used, A wins)
In all three constructions, A wins. This is also the case if Plain
Condorcet, Smith/Condorcet, Sequential Dropping, Schwartz Sequential
Dropping, the Beat Path Method, Borda Counts or Approval (with cut-off
in the middle) were applied. Plurality would elect B instead. Also
Tideman would elect B since it would add B>C and thus drop D>B. In
contrast, the River Method does not introduce B>C since it has already
the stronger A>C to show that C is not best! It seems that IRV would
elect D, but I can't quite believe that...
In order to get a quick solution with the interactive version of the
River Method, it is helpful when X is always among the current
destinations of the system and Y is an option with good chances against
X (this was the case in example (ii)). If we can manage to always choose
Y to be the option with the best chances against X that has not already
been compared with X, there will always be a unique destination from the
beginning, and we will never need to choose a different X than the
current destination. Hence the following version:
Semi-interactive version of river
for use in medium sized electorates:
0. Choose an arbitrary option as the beginning source (and destination).
1. DISCUSS alternatives to the current destination(s).
2. VOTE: By pairwise comparisons, determine the widths of the most
promising river beds that start at (one of) the current destination(s)
and have not been erased before.
3. VERIFY: For each of the widest among these, check whether the rivers
do follow it and draw it in that case.
4. FIX the system as in the interactive version.
5. REPEAT until stable.
In the above example, this could look like this:
Start: A A
B,C,D > A? B 14+ 11- (B>A added) A ---> B
C,D > B? D 15+ 10- (D>B added) A ---> B ---> D
A,C > D? A 13+ 12-
C 17+ 8- (C>D added) A ---> B ---> D ---> C
A,B > C? A 18+ 7-
B 16+ 9- (B>A dammed) B ---> D ---> C ---> A
B,D > A? B! (not used, A wins)
Mathematically speaking, the River Methos finds a special acyclic
subgraph T of the weighted directed graph G of all pairwise majorities.
T is acyclic, and if it contains two edges with the same source, they
have equal weight. I will call this class of subgraphs "almost-trees".
In the "general case" where all weights are different, T is a dual tree
in the usual sense. (Indeed, I could have called this the "Tree Method"
as well, but that would have been confused with "decision trees" too
easily, which are something quite different.)
In the "more general case" where every cycle in G contains a unique
smallest weight, T is in a certain sense optimal among all almost-trees
of G. Sort the weights of all edges in T by descending order and
continue this finite sequence by a sequence of zeroes, calling this the
"weight sequence" of T. It turns out that T has a lexicographically
maximal weight sequence among all almost-trees of G. (The lexicographic
order orders two sequences by the leftmost entry in which they differ.)
In other words, no edge can be added to T without removing some edge of
at least the same weight or introducing a cycle or a branching with
In this more general case, also Tideman's rule finds a subgraph of G
with a lexicographically maximal weight sequence, but it does so among
all acyclic subgraphs. Hence also Tideman's rule could be stated in a
way similar to the interactive version of the River Method. Indeed, we
need only remove steps 3a) and 4a) from the method and get an
interactive version of Tideman's rule. But that would of course be of
little use since the resulting acyclic graphs are in most cases by far
too complicated to be drawn in such an interactive way! In contrast,
(almost-)trees are a very simple structure. The most simple kind of
subgraph which would still guarantee a unique winner in every graph
would be the dual trees in the usual sense, but there is a reason why I
chose almost-trees instead of dual trees: allowing branchings with
equally weighted branches, the method becomes "component consistent" in
a very strong sense, in particular, it becomes immune against cloning.
Also Tideman's rule fulfils this important axiom, while the beat path
method does not.
Possible extensions / modifications
1. Precedence of majorities/defeats by class and approval.
This is a modification I considered to give better strategy proofness
for methods that are defined via an ordering of the pairwise defeats. It
defines the order in which defeats are considered not by using a single
"strength" like winning votes or margins, but by constructing a
"precedence" relation among all defeats (like it is done in many
Let MAG(A>B) designate the number of voters
strictly preferring A to B.
Let CLASS(A>B) designate the largest positive integer
such that mag(A>B) > (k-1)/k.
Let A>B designate the fact that mag(A>B) > mag(B>A).
Let APP(A) be the approval count of A
(as defined by cutoffs on the ballots, e.g.).
Let RVHR(A) be the rank of A in some random voter hierarchy
(for example defined as in MAM).
Let SCORE(A>B) designate the "score-tuple"
Assume that A>B and C>D. Then A>B is said to PRECEDE C>D
IF SCORE(A>B) IS LEXICOGRAPHICALLY LARGER THAN SCORE(C>D).
In other words, A>B precedes C>D if either
(i) class(A>B)>class(C>D), or
(ii) class(A>B)=class(C>D) and app(B)<app(D), or
(iii) class(A>B)=class(C>D), app(B)=app(D), and mag(A>B)>mag(C>D), or
(iv) app(B)=app(D), mag(A>B)=mag(C>D),
and B is below D in the r.v.h., or
(v) B=D, mag(A>B)=mag(C>D), and A is above C in the r.v.h.
This "precedence" first looks "coarsely" at the maginitudes of the
majorities, only considering the "class" of majority (either 1="simple
majority", 2="absolute majority", or 3,4,...="qualified majorities"). It
then looks at approval counts before considering the exact magnitudes.
The formal definition of precedence as a lexicographical ordering by
score-tuples makes it easy to compare with other definitions of
precedence. Standard MAM, for example, uses the score-tuples
Because of the similarity of the construction, I am quite hopeful that
most of Steve's proofs will be adaptable to definitions of "precedence"
with different score-tuples if only the individual entries of the tuple
behave nicely (e.g., monotonic).
Other suggestions of score-tuples:
3. Precedence of majorities/defeats
by aggregated strengths of individual preferences
/ Weighted Pairwise
James' current Weighted Pairwise proposal amounts to defining the
strength of a defeat as the sum of the individual ratings differentials
which are considered to measure the strength of the individual preferences.
A more simple way to take into account the difference between more and
less important individual preferences would be to define the strength of
the pairwise defeat A>B by
the number of voters
who approve of A
but don't approve of B.
This doesn't require the voters to specify ratings but only an approval
cutoff. When no approval cutoff is specified explicitly, the default
cutoff could be between the ranked and the unranked candidates. As this
is equivalent to using the ratings 0 and 100 only, most of the examples
in James' texts still work in this simplified version, so it has the
same nice strategy properties. The resulting method could be considered
a compromise between four excellent methods: Approval, Beatpath, Ranked
Pairs, and Weighted Pairwise.
Deterministic methods who use only magnitudes (or, analogously, margins)
of defeats can easily be "randomized" by first adding some random number
to each magnitude. These numbers could for example be drawn from the
uniform distribution on the interval [0,alpha*n] where n is the number
of voters and alpha>=0 is some parameter where alpha=0 means no
When applying this randomization to an immune method such as River,
Beatpath, or Ranked Pairs, say, one will get a probability
distribution whose support is the deterministic winner of the method
when alpha=0 or the whole Smith set when alpha>1, or some set between
the winner and the Smith set when 0<alpha<1. That is, by choice of alpha
we can adjust how widespread the distribution will be.
Whenever the base method is monotonic, the randomized version will also
be monotonic. Immunity of the (random) outcome is not guaranteed, of
course (since there if often only one immune option), but for immune
base methods and alpha<1, at least the following holds: Any defeat of
strength s against the winner can be countered by a chain of defeats of
strength at least s-alpha*n. The same is true for beatpaths when the
base method is the beatpath method.
Since most methods only use the order and not the actual values of the
magnitudes, one could also replace magnitudes by ranks before adding the
random numbers. This has the advantage that one has a better control
about what can change. For example, when the base method drops weakest
defeats from cycles, and numbers between 0 and 1.5 are added to the
ranks of the defeats, one knows that the randomized method drops only
weakest or second weakest defeats but never drops the strongest defeat
from a cycle. Certain combinatorial considerations suggest that the
following will give a suitable amount of randomization: With equal
probability, add either 0 or n + 3/2 to each rank independently.
To do list
- Do some spacial simulations
- Find more criteria that distinguish between River, Ranked Pairs, and
- Test for the "Intuitive Loser Criterion", the "Sincere Expectation
Criterion", and the "1st Choice Criterion"
- Implement and incorporate River into www.GrouCho.info
More information about the Election-Methods