[EM] Underneath the hood: Controversy over the "best" voting system.
Abd ul-Rahman Lomax
abd at lomaxdesign.com
Mon May 3 11:04:09 PDT 2010
For years, voting systems were studied through the use of "criteria,"
standards which a system either passes or fails. These criteria often
assumed a preference order, sometimes one that exists outside what is
expressed on the ballot. Arrow's accomplishment was in showing that a
simple set of criteria, each intuitively fair, could not
simultaneously be satisfied by any voting system (though he wasn't
really writing about voting systems, but about finding a social
preference order from a set of individual preference orders).
However, Arrow neglected utility, and there is an obvious method for
amalgamating individual *utilities* into a social preference order,
based on the sum of utilities. Arrow was aware of this omission; his
argument against using utilities was that he could imagine no
practical method of finding commensurable utilities. My suspicion is
that biology knew better, and developed decision-making methods for
the human brain that involve, in effect, comparing utilities (on an
attraction/aversion scale) and, in the end, using a sum of reaction
strengths to decide between options.
Be that as it may, here is an approach for comparing voting systems,
in terms of theoretical performance:
Given a set of absolute individual utilities for the universe of
possible candidates, on a scale from maximum attraction to maximum
aversion: the scale is linearized so that a step of the same size
anywhere on the scale represents the same effect on individual
choice, measured in the effort the individual would dedicate to
achieving that improvement. We can assume that different individuals
will have different overall ranges of utility. A depressed
individual, caring nothing about the world, may have a very small
range of absolute utilities, while a highly motivated and engaged
individual may have high absolute preference strength. We will start
by assuming that each linearized scale is expanded or contracted so
that the individual's overall range represents his or her absolute
motivation between the extremes.
There is then a clear possibility for a standard from which to
determine a social preference order: Candidates are ranked in order
of their sum of individual utilities on this absolute scale, across
all participants.
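
To make the standard concrete, here is a minimal sketch in Python.
The data representation is my own choice for illustration, not
anything prescribed here: "utilities" maps each voter to a dictionary
of candidate -> absolute utility, already linearized and scaled as
described above.

    def social_preference_order(utilities):
        """Rank candidates by their sums of absolute individual utilities."""
        candidates = next(iter(utilities.values())).keys()
        totals = {c: sum(u[c] for u in utilities.values()) for c in candidates}
        # Highest total utility first; the top entry is the "absolute winner."
        return sorted(totals, key=totals.get, reverse=True)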
There is no question but that collecting such absolute utility data
is difficult or impossible. It can, however, be approached; more on
that later.
There is an immediate application for this, suggested by the work of
Warren Smith in his simulations (IEVS). I will propose a "voting
system" as a candidate for an ideal one. This voting system will use
the absolute individual profiles as its input.
"No election," i.e., the choice to repeat the process possibly with
new candidates, investigation of candidates, etc., must be one of the
"candidates" rated.
The system will amalgamate these profiles in two ways. First, it will
order the candidates by their absolute utility sums, finding the
"absolute winner." Second, it will amalgamate them using a "strategic
voting robot": each individual, having "voted" the true absolute
utilities, sees that vote optimized into a Range vote of maximum
strength, according to the bot's knowledge of all other votes. The
bot is thus a perfect strategic "advisor," knowing all other votes.
But since all other votes may affect the bot's vote, the bot must
approach this iteratively, treating all "voters" the same at each
stage. Thus each "round" of bot voting is based on the results of the
previous round. The bot can use normalized Range ballots, since the
only information it needs is relative utility differences; it works
for each voter to maximize that voter's expected outcome.
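
The normalization step itself is simple; a sketch, assuming a 0-100
Range ballot (the resolution is an assumption, not prescribed here):

    def normalized_ballot(voter_utilities, top=100):
        """Rescale one voter's utilities so the favorite gets `top` and
        the least-favored gets 0; only relative differences survive."""
        lo, hi = min(voter_utilities.values()), max(voter_utilities.values())
        if hi == lo:
            return {c: 0 for c in voter_utilities}  # no real preference
        return {c: top * (u - lo) / (hi - lo)
                for c, u in voter_utilities.items()}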
I will not describe the bot's detailed operation at this time, beyond
saying that it iterates, using the results of the previous round to
determine votes with maximum strategic power, always participating in
the determination of the social preference order with maximally
effective votes, according to the best strategy definable from the
previous round. The criterion for "best" is maximization of expected
individual utility. The bot begins with first-preference information
to determine the first-round winner, and then uses this information
to determine the optimal vote for all voters in subsequent rounds.
"Optimal" means most likely to influence the eventual outcome,
assuming that the overall preference sequence, absent this voter's
vote, is as shown in the previous round, but that it may shift in the
next round toward a tie, making the voter's vote effective. A voter's
vote for a candidate is reduced from the normally optimal
full-approval vote if it would shift the overall preference order
with respect to that voter's higher preferences. The bot satisfies
later-no-harm, casting the maximum fractional vote that does not harm
the overall preference order.
Thus if a preference the bot previously voted for the voter would
slip below the preference position of the additional approval being
added, due to that additional vote, the vote for all such voters is
reduced to the fractional value necessary to avoid the slip. To avoid
problems with roundoff, the bot is allowed to vote, for preliminary
purposes, the exact fractional vote that creates a tie. Later
examination is performed to see whether reducing this vote by a
minimum increment would change the result. At any point, if removing
or reducing a previously cast vote improves the outcome for the
individual, it is removed or reduced.
(The bot votes for all "preference groups" as if they were a
coordinated block, following its algorithm).
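
Since the detailed operation is deliberately left open, the following
is only a schematic of the iteration structure, not the strategy rule
itself. Here "best_response" is a hypothetical placeholder for that
rule, which would implement the later-no-harm-respecting fractional
voting described above.

    def run_bot(utilities, best_response, max_rounds=100):
        """Iterate rounds of strategic voting until ballots stop changing."""
        # Round 1: first-preference information only.
        ballots = {v: bullet_vote(u) for v, u in utilities.items()}
        for _ in range(max_rounds):
            standings = tally(ballots)  # the previous round's result
            # All voters are treated the same: every ballot is re-optimized
            # against the same previous-round standings.
            new_ballots = {v: best_response(utilities[v], standings)
                           for v in utilities}
            if new_ballots == ballots:  # fixed point: no vote can improve
                break
            ballots = new_ballots
        return tally(ballots)

    def bullet_vote(voter_utilities, top=100):
        """Full-strength vote for the first preference only."""
        fav = max(voter_utilities, key=voter_utilities.get)
        return {c: (top if c == fav else 0) for c in voter_utilities}

    def tally(ballots):
        """Sum Range scores into a preference order, best first."""
        totals = {}
        for ballot in ballots.values():
            for c, score in ballot.items():
                totals[c] = totals.get(c, 0) + score
        return sorted(totals, key=totals.get, reverse=True)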
So for each collection of preference profiles, there is determined an
"absolute winner" and a "strategic winner," who may be different. The
strategic winner is the candidate who will win if all of his or her
relative supporters vote to maximize personal utility.
There is one additional consideration. A "voting likelihood standard"
will be set: a level of absolute utility below which a voter will not
bother to participate in an election (or, alternatively, a formula
for the likelihood of such participation). An "increment threshold"
will also be set, below which neither the voter nor the bot will act
to improve the outcome, there being no real preference.
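
In code, both thresholds are one-line tests; a sketch, with the
threshold values left as free parameters since no particular values
are proposed here:

    def participates(voter_utilities, turnout_threshold):
        """Voting likelihood standard: abstain if too little is at stake."""
        at_stake = max(voter_utilities.values()) - min(voter_utilities.values())
        return at_stake >= turnout_threshold

    def worth_acting(utility_gain, increment_threshold):
        """Increment threshold: below this there is no real preference."""
        return utility_gain >= increment_threshold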
To use this to test voting systems, a set of reference profiles
(collections of absolute utilities for a set of individuals) should
be developed, probably based on normal distributions in issue space,
with utility derived from distance. The development of this set of
reference profiles should be based purely on preference and utility
theory. It should be possible to predict the voting in some voting
systems based on these profiles and data about the election, which
would itself be a test of the set of preference profiles.
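
A sketch of how such profiles might be generated, assuming a simple
spatial model (the dimensionality and the linear distance-to-utility
falloff are my illustrative choices, not requirements of the proposal):

    import random

    def make_profiles(n_voters, n_candidates, dims=2, seed=None):
        """Voters and candidates drawn from a normal distribution in
        issue space; utility falls off with issue-space distance."""
        rng = random.Random(seed)
        point = lambda: [rng.gauss(0.0, 1.0) for _ in range(dims)]
        voters = [point() for _ in range(n_voters)]
        cands = {chr(65 + j): point() for j in range(n_candidates)}

        def utility(v, c):
            dist = sum((a - b) ** 2 for a, b in zip(v, c)) ** 0.5
            return -dist  # closer in issue space means higher utility

        return {i: {name: utility(v, c) for name, c in cands.items()}
                for i, v in enumerate(voters)}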
If the set of reference profiles is modified to include less likely
profiles, i.e., if for efficiency some profiles are sampled with
lower probability than they would actually occur, each profile will
carry relative probability information (a weight). Again, I won't
describe the details; they are better left to those doing this work.
Then, to test a voting system, the system is first applied with
"sincere" votes, cast according to the information the system
collects, and the "sincere winner" is determined. With systems that
consider preference strength, the absolute preference strength is
used to determine the "absolute sincere winner"; the vote normalized
to the full range determines the "normalized sincere winner"; and the
vote optimized by the bot determines the "strategic winner."
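
Pulling the categories together, one profile set yields the named
winners like this (a sketch reusing the earlier helpers; "system"
stands for whatever voting system is under test, a function taking
ballots and returning a winner):

    def winners(utilities, system, best_response):
        """The winner categories for one set of preference profiles."""
        absolute = social_preference_order(utilities)[0]
        absolute_sincere = system(utilities)  # raw absolute ratings as votes
        normalized_sincere = system({v: normalized_ballot(u)
                                     for v, u in utilities.items()})
        strategic = run_bot(utilities, best_response)[0]
        return absolute, absolute_sincere, normalized_sincere, strategic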
Regret is defined as the absolute-utility difference between the
"absolute winner" and the result in each of the categories
enumerated. For each category the distribution is given, i.e., in how
many elections out of the reference set this difference arose,
weighted by the probability data for the reference set if it is not
uniform across the profiles. Thus, for each system under test, the
"average regret" is determined, along with its variation and
distribution. When the system fails to find the absolute winner, how
much utility was lost? How likely was this, in a series of, say, 1000
elections?
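
The tabulation amounts to this (a sketch; each election in the
reference set carries the winner the system under test produced and
the profile's weight):

    def regret(utilities, absolute_winner, actual_winner):
        """Absolute-utility gap between the ideal and the actual result."""
        total = lambda c: sum(u[c] for u in utilities.values())
        return total(absolute_winner) - total(actual_winner)

    def average_regret(elections):
        """`elections` is a list of (utilities, actual_winner, weight)."""
        num = den = 0.0
        for utilities, actual, weight in elections:
            best = social_preference_order(utilities)[0]
            num += weight * regret(utilities, best, actual)
            den += weight
        return num / den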
Performance of each system under common voting-system criteria is
also determined, particularly criteria that require a particular
winner given the base preference profiles. How often did that winner
prevail? How often did that winner lose? And what was the average
regret in those cases? (With a Condorcet winner, the regret can be
negative, i.e., utility is improved when the Condorcet winner is
beaten by a sincere Range winner.)
There is a lot of work to be done to develop methods of comparing
voting systems. The *goal* of voting systems should be considered.
The approach described here assumes a value to maximizing absolute
utility. There are situations where absolute utility can actually be
determined, such as where there is a common medium of exchange and
equalized relationships to that medium. But by working with
simulations, the need to determine absolute utility is avoided. We
are not proposing the use of absolute utility ratings in real
elections. Rather, absolute ratings serve, in simulation, as the
measure guiding voting and voting strategy, against which the
performance of a voting system is assessed.
In addition to what is described above, realistic models of voter
behavior as to strategy, where the bot optimizing the vote is not
available, may be of use as well. It would also be interesting to
study a multi-round system whose primary offers "None of the above,
other than those I've voted for" as an option, since this ties the
approval cutoff to a real-world and equalized measure: the preference
for a candidate over holding a runoff. Simulating a runoff exactly is
impossible because the voter set will be different, but if we start
with simulated preference profiles for the entire population, we can
study what will happen if the preference profiles don't change
(simulating only the effect on turnout). We know, however, that one
of the major arguments for repeated balloting is that voters gain new
information. We can simulate this by incorporating an "ignorance
factor," which makes the voter utility profile murky in the primary
for some voters. This "murkiness" is then removed in considering
turnout for the runoff and how voters will vote there.
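
A sketch of that ignorance factor, assuming additive Gaussian noise
on the primary-round utilities (the noise model and the fraction of
ignorant voters are illustrative assumptions):

    import random

    def primary_view(utilities, ignorance, frac_ignorant=0.5, seed=None):
        """Murky utilities for some voters in the primary; the runoff
        uses the true `utilities`, i.e., the murkiness is removed."""
        rng = random.Random(seed)
        def murky(voter_utilities):
            return {c: u + rng.gauss(0.0, ignorance)
                    for c, u in voter_utilities.items()}
        return {v: murky(u) if rng.random() < frac_ignorant else u
                for v, u in utilities.items()}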
Without some agreement, however, on the purpose of voting, and what
goals are appropriate for it, we will find it continually difficult
to agree upon the relative performance of voting systems. A
particular criterion failure, for example, may be so rare, and/or so
low in damage to the voters involved, that it is negligible, even
when it *looks* horrible. This is common in consideration of Range
voting, for example, where supposedly, of 100 voters, 99 vote A:100,
B:99, and one voter votes A:0, B:100. It's claimed that the A voters
will be outraged that their favorite lost, but their votes indicate
that they really didn't care! Where this argument makes sense is if
there were an unsupported clone, C, who was rated 0 by all the A
voters, and they only rated B at 99 because they were worried that C
might win.
What this boils down to is an argument that 99% of voters were
freaked out by a no-hope C. No system can perform well if 99% of
voters are totally ignorant of the real situation! In Bucklin, of
course, this problem would not arise; Bucklin has the reverse
problem: if we assume that the utilities are sincere, the utility
maximizer, B, loses. But this is a very close election, as stated.
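
Working the sincere Range totals for that example:

    A: 99 voters x 100 + 1 voter x 0   = 9900
    B: 99 voters x 99  + 1 voter x 100 = 9901

B wins by a single point out of nearly ten thousand, so electing A
instead would cost about one part in ten thousand of the expressed
utility.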
The loss of overall utility is very low. In a real group, meeting in
person and using Range for a quick assessment, if I were the chair I
know what I'd do. I'd inform the meeting of the Range results, and
that there was a conflict between the Majority criterion and the
Range result. Depending on context, I'd suggest a motion. But it
doesn't much matter what that motion would be, because it could be
amended by a majority, and quickly. I'd certainly allow the B
supporter to present the reasons for his or her vote. And then the
majority would decide the election. I have known elections where a
supermajority voted one way, then saw a minority report, and the
supermajority changed its decision. If the situation were really such
that the A voters had no true preference, as the Range votes seem to
indicate, they might well stand aside, given a reasonable argument
from the B supporter. But if, on the other hand, the votes of 99 for
B were artificially high due to the presence of C, and the voters,
now knowing that C wasn't a real option, weren't willing to support B
any more, A would win. It would only take one voter!
I do not see public elections going so far to maximize utility that
they would produce a result like this. It's purely a straw man,
invented by those with reasons to propose a problem with Range
Voting, as, for example, Saari.
But, of course, Borda, Saari's favorite, really does have the same
"problem." Just make it 101 candidates, with 99 of them being truly
awful, and with all the voters preferring A and B to them except one,
the B supporter, who reverses the preference, putting B on top and A
at the bottom. Converting the Borda positions to Range scores of
0-100, and assuming that all voters rank all candidates, we end up
with the same votes in Borda as in Range, except that now we really
do suspect B of voting strategically, to bury A. B wins. The "sin" of
Range here is having so many ratings; but give Borda and Range the
same resolution, in an election with a smaller number of candidates,
and the same phenomenon can be shown.
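
A quick numerical check of that claim, as a Python sketch (the
candidate names and the 0-100 point scale are my choices for
illustration):

    # 101 candidates: A, B, and 99 truly awful clones. Borda points run
    # from 100 for first place down to 0 for last.
    awful = ["X%d" % i for i in range(99)]
    ballots = 99 * [["A", "B"] + awful] + [["B"] + awful + ["A"]]

    totals = {}
    for ranking in ballots:
        for points, cand in zip(range(100, -1, -1), ranking):
            totals[cand] = totals.get(cand, 0) + points

    print(totals["A"], totals["B"])  # 9900 9901: B wins, just as in Range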