[EM] (2) “goal of a better election method”

steve bosworth stevebosworth at hotmail.com
Sat Feb 18 08:05:28 PST 2017


From: Kristofer Munsterhjelm <km_elmet at t-online.de>
Sent: Tuesday, February 14, 2017 9:41 PM
To: steve bosworth; election-methods at lists.electorama.com;
Subject: Re: [EM] “goal of a better election method”

Hi Kristofer,

S: Thank you for your helpful suggestions and carefully detailed analysis of my proposals.
Again, I look forward to your feedback.

K: I suppose the concern that comes most clearly to mind is that MJ is not
meant to be a cardinal method, whereas using averages introduces a
cardinal (numeric) element.

To rephrase that, all that "plain" MJ knows about is that there's a
common standard among the people where

Excellent is better than Very Good
Very Good is better than Good
Good is better than Acceptable
Acceptable is better than Poor
and Poor is better than Reject.

It doesn't know what the common standard *is*, however.
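K's ordinal point can be made concrete with a minimal sketch (Python; the 0-5 numbering below exists only so the grades can be sorted, and carries no cardinal meaning):

```python
# Plain MJ needs only the *order* of the grades, never their numeric meaning.
# The 0-5 ranks below exist solely so sorted() can order the ballots.
GRADES = ["Reject", "Poor", "Acceptable", "Good", "Very Good", "Excellent"]
RANK = {g: i for i, g in enumerate(GRADES)}  # higher index = better


def median_grade(ballots):
    """Return the (lower) median grade of a list of grade names."""
    ordered = sorted(ballots, key=RANK.get)
    return ordered[(len(ordered) - 1) // 2]


votes = ["Good", "Excellent", "Poor", "Very Good", "Good"]
# sorted: Poor, Good, Good, Very Good, Excellent -> median grade "Good"
```

Renaming the grades, or re-spacing the underlying standard, changes nothing in this computation, which is the sense in which plain MJ is not cardinal.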

S:  But of course MJ counts each “grade” a voter gives to a candidate as if it were given “honestly”, i.e. according to that voter’s judgment of how closely the candidate approaches the voter’s idea of an “excellent” candidate for the office.  In this limited B&L sense, each of these “ideas” is “absolute” rather than “relative”, i.e. capable of being defined in a common language.

K:  To use a school grading metaphor, it knows that A is better than B, but
not that
A: 100%-97% of max attainable points
B: 97%-85%
(or what have you).

S:  As I understand it, you are largely correct, and this accords with B&L’s presentation.  They clearly say that MJ’s “language measures are …. not cardinal ….” (p.168), but, at the same time, “nor are they merely ordinal” (p.168).  Accordingly, B&L go on to describe how “grades” can become clearly defined by “learning and practice”, e.g. an “experienced professor … has a well-developed set of benchmarks that together define absolute evaluations that dominate the relative comparisons” (p.168) of student performance.  In my own experience as a university professor, each teacher was required to publish the different sets of “learning outcomes” that a student would have to display in order to receive any one of the following “grades”: A, B, C, D, or F.  B&L add that the “same is true of voters: they have seen and learned about able statesmen, presidents and prime ministers …, as well as inept or corrupt officeholders.  They also have clear benchmarks.”  Perhaps B&L would also say that even when some numbers are used as alternative expressions of such linguistically defined grades, this can be an example of what they say at the top of page 48 of your first reference below:

“ ... there may well be situations where the numbers are at once a common language and an interval measure: possible examples are those used in evaluating wines, divers, and figure skaters, where the judges are professionals who have learned the meanings of the numbers and scales.”

Still, I accept that HMJ departs from B&L’s clear statement that “grades” are “not cardinal because adding them makes no sense” (p.168).  Nevertheless, might it not be argued that HMJ appropriately honors the middle position its “grades” occupy as a result of not being “merely ordinal” (as explained above), given that HMJ averages only the “grades” to the left of each “median-grade”, including that median-grade?  In this way, might HMJ not still claim to be less subject to anti-democratic “manipulation” than all the other single-winner methods that sum or average all the citizens’ votes, i.e. not less by MJ’s “about half”, but still less?
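For concreteness, here is one possible reading of the HMJ tiebreak described above, sketched in Python. Two assumptions are mine, not from the thread: that “to the left of the median” means the ballots from the highest grade down to and including the median, and that the grades are provisionally numbered 0-5 for the averaging step only:

```python
GRADES = ["Reject", "Poor", "Acceptable", "Good", "Very Good", "Excellent"]
RANK = {g: i for i, g in enumerate(GRADES)}  # 0-5 numbering: my assumption


def hmj_tiebreak_score(ballots):
    """Average the grades from the best one down through the median."""
    desc = sorted(ballots, key=RANK.get, reverse=True)  # best grade first
    upper_half = desc[: len(desc) // 2 + 1]             # through the median
    return sum(RANK[g] for g in upper_half) / len(upper_half)


votes = ["Good", "Excellent", "Poor", "Very Good", "Good"]
# upper half: Excellent, Very Good, Good -> (5 + 4 + 3) / 3 = 4.0
```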

K:  Suppose that we want to use MJ to determine the student with the best
performance. Then MJ is supposed to work equally well no matter whether
the grading scale is

A: 100%-97%
B: 97%-80%
(... etc)

or if the grading scale is

A: 100%-92%
B: 92%-77%
(... etc)

as long as all classes use the same grading scale (that's the "common
standard" qualification).

If you use averaging as part of an MJ method, then the ordinal
assumption is violated, because taking the mean doesn't just depend on
the relative order (i.e. A is better than B), but also on just how much
(A has a mean score of 98.5%, B has a mean score of 88.5% vs A has a
mean score of 96% and B has a mean score of 84.5%).
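A small worked example (the grade-to-percent numbers are invented for illustration, not taken from B&L) shows how a mean-based comparison can flip when the common standard is re-spaced, while the median comparison cannot:

```python
# Two equally legitimate common standards, differing only in where the
# grade boundaries sit (midpoint values; my own illustrative numbers).
scale_1 = {"A": 98, "B": 90, "C": 70}
scale_2 = {"A": 99, "B": 80, "C": 75}

x = ["A", "C", "C"]   # candidate X's grades; median grade C
y = ["B", "B", "B"]   # candidate Y's grades; median grade B


def mean(grades, scale):
    return sum(scale[g] for g in grades) / len(grades)

# By median, Y beats X under both scales (B is better than C).
# By mean, Y beats X under scale_1 (90 > 79.3) but loses under
# scale_2 (80 < 83), even though no ballot has changed.
```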

S:  Given the above, like B&L I see HMJ’s and MJ’s “grades” as determined not by numbers or percentages but by different sets of “learning outcomes”, qualities, or “benchmarks” that can be defined in a “common language”.


K: As a consequence, the method may produce better results in certain
scenarios (where the common standard is equally spaced), but it trades
that off by potentially producing worse results in other scenarios
(where the common standard is not equally spaced).

Or more simply put: making the scale cardinal, which you need to be able
to calculate means on the votes, adds more assumptions that may not be true.
For more on this particular objection, see B&L's "Election by Majority
Judgment: Experimental Evidence", p 45. and onwards ("Voting by Points
and Summing"). Quoting:

>> But, is it reasonable to use numerical scales in voting? The answer is a resounding
>> no, for several reasons.

S:  But I see that B&L continue with this quote as follows: “But, is it reasonable to use numerical scales in voting? The answer is a resounding no, for several reasons.  First, the numbers mean nothing unless they are defined: proposals to use weights give them no definition. Their only real “meaning” is found in their strategic use.  This induces comparisons, which immediately leads to Arrow’s paradox.”


Firstly, given the earlier discussion above, each of HMJ’s “numbers” would be used only when breaking ties, and each would be “defined”.

Secondly, these numbers would not appear on the HMJ ballots used by voters.  Such ballots would seem much less likely to “induce comparisons … leading to Arrow’s paradox”.


K: (My source is
https://1984f707-a-62cb3a1a-s-sites.googlegroups.com/site/ridalaraki/xfiles/ElectionByMajorityJudgment(ExperimentalEvidence)Final.pdf)

From an MJ point of view, only using an inferred numerical scale for
tiebreaking is surely better than using it everywhere (like in Range).
But the same theoretical arguments apply as soon as you're using a
numerical scale at all.


S: Given the above, and while the “spaces” between HMJ’s and MJ’s 6 “grades” cannot claim to be exactly equal, perhaps we can still say that B&L’s electoral experiments and tests using MJ in France and elsewhere give us a reason to say the following: in contrast to B&L’s own “majority-gauge” and “majority-value” procedures for breaking ties, HMJ’s limited averaging would probably provide some additional useful information for breaking ties, i.e. HMJ’s “approximate” averages are likely to reveal a slightly sharper view of the different intensities with which the majority of voters value each candidate as “fit” for the office.  What do you think?

K: There's also a strategy argument, which could be generally argued like this:

- Either ties of the type where many candidates have the same median are
common or they're not.
- If they're not common, then a mean tiebreak is not going to change the
outcome often, so we can go with Bucklin or MJ's system.


S:  However, if my above suggestion is valid, HMJ’s mean score of each candidate’s grades to the left of her “median-grade”, and including that median-grade, would be better, even if it only makes a difference rarely.

K:  - If they're common, then, since Range is more susceptible to strategy
than MJ, using a mean tiebreak will make HMJ considerably more
susceptible to strategy as well, and so should be avoided.

S:  While HMJ might be slightly more “susceptible to strategy” than MJ, am I correct in thinking it would not be as susceptible as Range?


K: See this slide set by B&L for more on that:
http://igm.univ-mlv.fr/AlgoB/algoperm2012/01Laraki.pdf . In particular


they say:

"The unique [social grading function]s that are partially
strategy-proof-in-ranking are the order functions."

This means that the only grade-based voting methods that are partially
strategy-proof in ranking (which they define earlier) are order
functions, which are of the form "return the nth best grade" for some n.
(The median sets n = voters/2.)  Consequently, to keep the SGF partially
strategy-proof-in-ranking, the tiebreak should also be an order function
with a different n, which it is in MJ and Bucklin, but not in HMJ.
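An order function in B&L's sense can be sketched very simply (Python; the 0-5 ranks are again only for sorting, not cardinal values):

```python
GRADES = ["Reject", "Poor", "Acceptable", "Good", "Very Good", "Excellent"]
RANK = {g: i for i, g in enumerate(GRADES)}


def order_function(k):
    """f_k: return the k-th best grade given (k=1 is the highest)."""
    def f_k(ballots):
        return sorted(ballots, key=RANK.get, reverse=True)[k - 1]
    return f_k


votes = ["Good", "Excellent", "Poor", "Very Good", "Good"]
f_1 = order_function(1)   # best grade; B&L note f_1 maximizes manipulability
f_3 = order_function(3)   # the median for 5 voters
```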

B&L do not analyze the HMJ variant in itself, but they note:

"The unique aggregation functions that minimize the probability of
effective-manipulability are the middlemost. Point-summing-methods, f_1
and f_n maximize this probability."

which does suggest that using a point-summing method as a tiebreak will
weaken the method more than using another type of method.



S: Yes, partially weaken HMJ with regard to manipulation resistance, but perhaps also give HMJ the advantage of discovering more exactly the candidate who is most highly valued as “fit” for the office.  Also, HMJ seems to use a tie-breaking procedure that would be easier for ordinary voters to understand.  Consequently, perhaps HMJ would still reduce its vulnerability to “manipulation” by about 20% rather than MJ’s “about half”.  Is this something one could calculate more exactly using something like B&L’s formula on pages 197-8 (Majority Judgment, 2011, MIT) or slide 171 of the second source you referred me to?  I hope you can sharpen my understanding of these calculations.

K: In addition, averaging would make a method no longer
strategy-proof-in-grading. For the same reason that the median (being an
order function) makes MJ resistant to strategy, an order function tie
break makes MJ resistant to strategy in the tiebreak. B&L say:

"The unique strategy-proof-in-grading SGFs are the order functions.

If the mechanism is a point-summing method (the mean with respect to
some parametrization), for almost all profiles, all voters can manipulate."



S:  If so, such a potential for “manipulation” would seem to be further reduced if my currently proposed HMJ (i.e. HMJ(1)) were modified to become HMJ(2), as follows: the tie-breaking procedure would be changed so that only the candidates who had received the “highest median-grade” would be considered as potential winners, i.e. only the “grades” received by these candidates to the left of their common “highest median-grade”, and including this “highest median-grade”, would be used to calculate their different mean scores of evaluation.  In this way, HMJ(2) would seem to retain the feature you mention in the last sentence of your additional thoughts below: “only an order function will do for the main scoring prior to any tiebreaker”.  Perhaps this modification would reduce HMJ(2)’s vulnerability to “manipulation” by about 30%, rather than HMJ(1)’s by about 20% or MJ’s by “about half”.  What do you think?
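Read literally, HMJ(2) might be sketched like this in Python. The assumptions are mine: the averaging runs from each candidate's best grade down through the median, and the 0-5 numbering is illustrative only:

```python
GRADES = ["Reject", "Poor", "Acceptable", "Good", "Very Good", "Excellent"]
RANK = {g: i for i, g in enumerate(GRADES)}  # illustrative numbering only


def median(ballots):
    return sorted(ballots, key=RANK.get)[(len(ballots) - 1) // 2]


def upper_mean(ballots):
    desc = sorted(ballots, key=RANK.get, reverse=True)
    top = desc[: len(desc) // 2 + 1]              # down through the median
    return sum(RANK[g] for g in top) / len(top)


def hmj2_winner(profiles):
    """profiles: dict of candidate -> list of grades.  Only candidates
    sharing the highest median-grade enter the mean tiebreak."""
    best = max(RANK[median(b)] for b in profiles.values())
    tied = {c: b for c, b in profiles.items() if RANK[median(b)] == best}
    return max(tied, key=lambda c: upper_mean(tied[c]))


profiles = {
    "X": ["Good", "Excellent", "Good", "Poor", "Good"],
    "Y": ["Good", "Very Good", "Good", "Acceptable", "Good"],
}
# Both medians are "Good"; X wins the upper-half mean tiebreak.
```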

K: (Last slide on "Strategy in Grading")

This is not to say that HMJ is a *bad* method. I would certainly choose
it if the alternative was, say, IRV or Borda. But using averaging does
undermine B&L's theoretical foundation of MJ (since it is no longer an
ordinal grade method), and if you accept their strategy-resistance
arguments, it also weakens MJ's resistance to strategy.



S:  Given that we now have two different formulations of HMJ to consider, as well as MJ and a number of other somewhat attractive single-winner methods, I would very much appreciate hearing your reasons for the method you currently most favor for electing a president.

 [...]

K:  I forgot to mention this strategy-in-grading problem.

Suppose we're using ordinary MJ, and candidate X's final grade is Good.
Someone who gave X a grade of "Very Good" has no reason to exaggerate to
"Excellent" because his vote is counted equally according to MJ's tie
breaker. This helps prevent the method from becoming Really Expensive
Approval where everybody just votes max or min.

However, if you use averages as a tiebreak, the voter might think: "I'm
reasonably sure X's final grade is going to be Good, but as there may be
other candidates with Good as a final grade as well, I should do my best
to make sure X's average gets as high as possible, which means that I
should vote Excellent instead of Very Good". If enough voters do that,
then the method slides into Approval.
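K's scenario can be checked directly: one voter's exaggeration from “Very Good” to “Excellent” leaves the median (an order function) untouched but raises a mean-based tiebreak score (Python, with illustrative 0-5 ranks):

```python
GRADES = ["Reject", "Poor", "Acceptable", "Good", "Very Good", "Excellent"]
RANK = {g: i for i, g in enumerate(GRADES)}


def median(ballots):
    return sorted(ballots, key=RANK.get)[(len(ballots) - 1) // 2]


def mean_rank(ballots):
    return sum(RANK[g] for g in ballots) / len(ballots)


honest    = ["Good", "Good", "Very Good", "Acceptable", "Good"]
strategic = ["Good", "Good", "Excellent", "Acceptable", "Good"]
# median unchanged ("Good" in both cases), but the mean score rises,
# so the exaggeration pays off only under a mean-based tiebreak.
```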

And since we need all the help we can get to keep MJ from becoming
Approval (some Range advocates say that MJ would essentially become
Approval even in its current state), it's best to make this kind of
strategy ineffective. And B&L say that requirement narrows down the only
tiebreakers you can use into order functions -- for the same reason that
only an order function will do for the main scoring prior to any tiebreaker.

S:  Thank you, I look forward to your feedback.

Steve
